Diagnostic Performance of AI for Cancers Registered in a Mammography Screening Program: A Retrospective Analysis

Abstract

Purpose: To evaluate the performance of an artificial intelligence (AI) algorithm in a simulated screening setting and its effectiveness in detecting missed and interval cancers. Methods: Digital mammograms were collected from the Bahcesehir Mammographic Screening Program, the first organized, population-based, 10-year (2009-2019) screening program in Turkey. In this retrospective study, 211 mammograms were extracted from the archive of the screening program. Of these, 110 were diagnosed as breast cancer (74 screen-detected, 27 interval, 9 missed) and 101 were negative mammograms with at least 24 months of follow-up. Cancer detection rates of the radiologists in the screening program were compared with those of an AI system. Three mammography assessment methods were used: (1) assessment by two radiologists at the screening center, (2) AI assessment based on the established risk score threshold, and (3) a hypothetical radiologist and AI team-up in which AI was considered the third reader. Results: In ROC analysis of AI cancer detection, the area under the curve was 0.853 (95% CI = 0.801-0.905), and the cut-off value for the risk score was 34.5%, with a sensitivity of 72.8% and a specificity of 88.3%. Cancer detection rates were 67.3% for radiologists, 72.7% for AI, and 83.6% for the radiologist and AI team-up. AI detected 72.7% of all cancers on its own, of which 77.5% were screen-detected, 15% were interval cancers, and 7.5% were missed cancers. Conclusion: AI may enhance the capacity of breast cancer screening programs by increasing cancer detection rates and decreasing false-negative evaluations.
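The detection rates reported above can be checked with simple arithmetic. The sketch below is illustrative only: the cancer counts (110, 74, 27, 9) come from the abstract, whereas the AI counts (80 detected, split as 62, 12, and 6) are not stated in the abstract and are back-calculated here from the reported percentages. The team-up count further assumes that the radiologists detected all screen-detected cancers and none of the interval or missed ones, which is consistent with the reported 67.3% but remains an assumption.

```python
# Counts stated in the abstract.
TOTAL_CANCERS = 110
SCREEN_DETECTED = 74  # cancers found at screening by the two radiologists
INTERVAL = 27
MISSED = 9

# ASSUMPTION: back-calculated from the reported percentages, not given in the abstract.
AI_DETECTED = 80   # 72.7% of 110
AI_SCREEN = 62     # 77.5% of 80
AI_INTERVAL = 12   # 15% of 80
AI_MISSED = 6      # 7.5% of 80


def rate(numerator: int, denominator: int) -> float:
    """Return a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


radiologist_rate = rate(SCREEN_DETECTED, TOTAL_CANCERS)
ai_rate = rate(AI_DETECTED, TOTAL_CANCERS)

# ASSUMPTION: the team-up adds the interval and missed cancers flagged by AI
# to the cancers the radiologists already detected at screening.
team_detected = SCREEN_DETECTED + AI_INTERVAL + AI_MISSED
team_rate = rate(team_detected, TOTAL_CANCERS)

ai_breakdown = (
    rate(AI_SCREEN, AI_DETECTED),
    rate(AI_INTERVAL, AI_DETECTED),
    rate(AI_MISSED, AI_DETECTED),
)

print(radiologist_rate, ai_rate, team_rate, ai_breakdown)
```

Under these assumptions the three rates reproduce the reported 67.3%, 72.7%, and 83.6%, and the breakdown reproduces 77.5%, 15%, and 7.5%.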

Keywords

artificial intelligence, breast cancer, deep learning, mammography, screening
