Potential for AI as first reader in lung cancer screening

Balbi, Maurizio
2026-01-01

Abstract

Purpose: To retrospectively assess the agreement between human and automated AI-based readings of low-dose computed tomography (LDCT) according to Lung-RADS v1.1 in lung cancer screening (LCS), and to compare the diagnostic performance of the two readings.

Methods: We included 4104 baseline LDCTs from the BioMILD trial. A radiologist retrospectively classified the original readings as "negative" (Lung-RADS v1.1 categories 1, 2) or "positive" (categories 3, 4), and AI software analyzed the same LDCTs for category assignment. Diagnosis of lung cancer (LC) at 2 years served as the reference standard for assessing the sensitivity, specificity, negative predictive value (NPV), and positive predictive value (PPV) of both human and AI readings. Agreement between readers was measured by Cohen's kappa with Fleiss-Cohen weights (Kw) and its 95 % CI.

Results: The median age of participants was 60 years; 60.8 % were male and 79.2 % were current smokers. LC was diagnosed in 68/4104 participants (1.7 %); of these, 6/68 (8.8 %) and 7/68 (10.3 %) had LDCTs classified as negative by AI and human reading, respectively. The agreement between human and AI readings for negative and positive LDCTs was 83.5 % (Kw 0.47; 95 % CI: 0.43-0.50). Sensitivity and specificity were 91.2 % and 75.7 % for AI, and 89.7 % and 90.0 % for human reading (p-values 0.5637 and < 0.0001, respectively). PPV and NPV were 6.0 % and 99.8 % for AI, and 13.1 % and 99.8 % for human reading (p-values < 0.0001 and 0.9351, respectively). The expected reduction in LDCT reading workload when using AI as first reader was 74.7 %.

Conclusion: AI reading showed sensitivity comparable to human reading but lower specificity. The high NPV of AI may support its use as a first reader in LCS.
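The reported metrics can be cross-checked with a short calculation. The sketch below is not the authors' analysis: it rebuilds approximate 2x2 tables from figures given in the abstract (4104 LDCTs, 68 cancers, 6 and 7 cancers read as negative, and the rounded specificities), so the true-negative and false-positive counts are assumptions and the results may differ from the published values in the last digit. The names `Confusion` and `from_abstract` are hypothetical. Treating the workload reduction as the share of LDCTs that AI reads as negative is also an assumption, not a definition taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """2x2 table of a reading against the 2-year lung cancer diagnosis."""
    tp: int  # cancers read as positive
    fn: int  # cancers read as negative
    fp: int  # non-cancers read as positive
    tn: int  # non-cancers read as negative

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    @property
    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)

    @property
    def negative_share(self) -> float:
        """Fraction of all scans read as negative."""
        return (self.tn + self.fn) / (self.tp + self.fn + self.fp + self.tn)


def from_abstract(total: int, cancers: int, missed: int, specificity: float) -> Confusion:
    """Approximate the table; tn and fp are inferred from a rounded specificity."""
    non_cancers = total - cancers
    tn = round(specificity * non_cancers)  # assumption: not reported directly
    return Confusion(tp=cancers - missed, fn=missed, fp=non_cancers - tn, tn=tn)


TOTAL, CANCERS = 4104, 68
ai = from_abstract(TOTAL, CANCERS, missed=6, specificity=0.757)
human = from_abstract(TOTAL, CANCERS, missed=7, specificity=0.900)

for name, reading in (("AI", ai), ("Human", human)):
    print(
        f"{name}: sensitivity {reading.sensitivity:.1%}, "
        f"specificity {reading.specificity:.1%}, "
        f"PPV {reading.ppv:.1%}, NPV {reading.npv:.1%}"
    )

# Assumed reading of "workload reduction": scans a human would not need to read
# if every AI-negative LDCT were accepted without review.
print(f"AI-negative share of all LDCTs: {ai.negative_share:.1%}")
```

Because lung cancer prevalence in the cohort is only 1.7 %, the lower specificity of AI roughly halves its PPV relative to human reading while leaving NPV essentially unchanged, which is the pattern the abstract reports.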
Year: 2026
Volume: 195
First page: 1
Last page: 5
Keywords: Artificial intelligence; Low-dose computed tomography; Lung cancer screening
Authors: Ledda, Roberta Eufrasia; Valsecchi, Camilla; Sabia, Federica; Milanese, Gianluca; Balbi, Maurizio; Rolli, Luigi; Ruggirello, Margherita; Sverzellati, ...
Files in this record:
File: PIIS0720048X25006473.pdf
Access: Open access
Size: 793 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/2121656