Improved assessment of donor liver steatosis using Banff consensus recommendations and deep learning algorithms

Gambella, Alessandro; Salvi, Massimo; Molinaro, Luca; Patrono, Damiano; Cassoni, Paola; Papotti, Mauro; Romagnoli, Renato; Molinari, Filippo

doi:10.1016/j.jhep.2023.11.013

Background & aims: The Banff Liver Working Group recently published consensus recommendations for steatosis assessment in donor liver biopsies, but few studies have reported their use, and no automated deep-learning algorithms based on the proposed criteria have been developed so far. We evaluated the Banff recommendations on a large monocentric series of donor liver needle biopsies by comparing pathologists' scores with those generated by convolutional neural networks (CNNs) that we specifically developed for automated steatosis assessment.
Methods: We retrospectively retrieved 292 allograft liver needle biopsies collected between January 2016 and January 2020 and performed steatosis assessment using a former intra-institution method (pre-Banff method) and the newly introduced Banff recommendations. Scores provided by pathologists and CNN models were then compared, and the degree of agreement was measured with the intraclass correlation coefficient (ICC).
Results: With the pre-Banff method, agreement between the pathologist and the CNN models was poor for small droplet macrovesicular steatosis (ICC: 0.38), large droplet macrovesicular steatosis (ICC: 0.08), and the final combined score (ICC: 0.16), and none of these reached statistical significance. Interestingly, significantly improved agreement was observed using the Banff approach: ICC was 0.93 for the low-power score (p <0.001), 0.89 for the high-power score (p <0.001), and 0.93 for the final score (p <0.001). When the pre-Banff method was compared with the Banff approach on the same biopsy, pathologist and CNN model assessments showed a mean (±SD) percentage of discrepancy of 26.89 (±22.16) and 1.20 (±5.58), respectively.
Conclusions: Our findings support the use of Banff recommendations in daily practice and highlight the need for a granular analysis of their effect on liver transplantation outcomes.
Impact and implications: We developed and validated the first automated deep-learning algorithms for standardized steatosis assessment based on the Banff Liver Working Group consensus recommendations. Our algorithms provide an unbiased automated evaluation of steatosis, which will lay the groundwork for granular analysis of the short- and long-term effects of steatosis on organ viability, enabling the identification of clinically relevant steatosis cut-offs for donor organ acceptance. Implementing our algorithms in daily clinical practice will allow for more efficient and safer allocation of donor organs, improving the post-transplant outcomes of patients.
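
As an illustration of the agreement analysis described in the Methods, the following is a minimal sketch of how an intraclass correlation coefficient between pathologist and CNN scores might be computed. The abstract does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1), is an assumption here. The function name `icc_2_1` and the toy scores are hypothetical and are not data from the study.

```python
import numpy as np


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `scores` has one row per biopsy and one column per rater
    (e.g. column 0 = pathologist, column 1 = CNN). This ICC form is
    an assumption; the source does not specify which one was used.
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape  # n subjects (biopsies), k raters

    grand_mean = x.mean()
    row_means = x.mean(axis=1)  # per-biopsy means
    col_means = x.mean(axis=0)  # per-rater means

    # Two-way ANOVA sums of squares
    ss_rows = k * np.sum((row_means - grand_mean) ** 2)
    ss_cols = n * np.sum((col_means - grand_mean) ** 2)
    ss_total = np.sum((x - grand_mean) ** 2)
    ss_error = ss_total - ss_rows - ss_cols

    # Mean squares
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


if __name__ == "__main__":
    # Hypothetical steatosis percentages (illustrative values only):
    # column 0 = pathologist score, column 1 = CNN score.
    toy_scores = [
        [5, 6],
        [10, 9],
        [30, 33],
        [0, 1],
        [20, 18],
    ]
    print(f"ICC(2,1) = {icc_2_1(toy_scores):.3f}")
```

Values close to 1 indicate that the two raters assign nearly identical scores to each biopsy, while values close to 0 indicate that rater disagreement is as large as the variation between biopsies.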