Researchers compared a simulated autonomous AI triaging strategy with double reading or single reading by radiologists in a retrospective analysis of 15,987 digital mammography (DM) and digital breast tomosynthesis (DBT) examinations. The examinations came from the Córdoba Tomosynthesis Screening Trial and were acquired between January 2015 and December 2016. They included 98 screening-detected and 15 interval cancers.
Version 1.6.0 of the Transpara AI software from ScreenPoint Medical scored each examination from 1 to 10 according to the likelihood that a visible cancer was present. Of all the examinations, 70% received a score of 7 or less, meaning that they were very likely to be normal, and were therefore read only by the AI.
The researchers based this 70% cutoff on prior research showing that replacing double reading with single reading for less suspicious cases would not reduce sensitivity by more than 5%, explained José Luis Raya-Povedano, MD, lead author of the study and a radiologist at Reina Sofia University Hospital in Córdoba, Spain, in an interview with Medscape Medical News. The remaining 30% of examinations, which scored above 7 and were deemed highest risk, were reviewed by one or two radiologists. Examinations that radiologists did not flag but that the AI ranked among the 2% most suspicious were also recalled.
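The triage rule described above can be sketched in a few lines of Python. This is only an illustration of the logic as reported here, not the study's implementation: the function names, the boolean inputs, and the assumption that AI-only examinations are treated as normal and never recalled are all hypothetical.

```python
AI_ONLY_MAX_SCORE = 7  # scores of 7 or less were read only by the AI


def readers_needed(score: int) -> str:
    """Route an examination by its 1-10 AI score (hypothetical helper)."""
    if not 1 <= score <= 10:
        raise ValueError("score must be between 1 and 10")
    return "ai_only" if score <= AI_ONLY_MAX_SCORE else "radiologist"


def is_recalled(score: int, radiologist_flagged: bool, ai_top_2_percent: bool) -> bool:
    """Decide recall for one examination under the simulated strategy."""
    if readers_needed(score) == "ai_only":
        # Assumption: AI-only examinations are treated as normal, so no recall.
        return False
    # Radiologist-read examinations are recalled if a radiologist flags them
    # or if the AI ranks them among the 2% most suspicious.
    return radiologist_flagged or ai_top_2_percent
```

For example, `is_recalled(10, False, True)` returns `True`: the radiologist did not flag the examination, but the AI placed it among the most suspicious 2%.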
Three screening strategies were used in the Córdoba Tomosynthesis Screening Trial: double reading DM, double reading DBT, and single reading DBT. Raya-Povedano’s team compared these strategies with the AI-based screening strategy. Additionally, they compared the screening performance of DM alone to a combined AI-DBT strategy.
Compared with double reading, using AI with DBT resulted in a 72.5% decrease in radiologist workload (568 vs 156 hours needed; P < .001), a 16.7% lower recall rate (706 vs 588 recalls in 15,987 exams; P < .001), and noninferior sensitivity (92 vs 95 of 113 cancers detected; P = .38).
Similarly, supplementing DM with AI led to a 71.5% decrease in workload (222 vs 63 hours needed; P < .001), 16.9% lower recall rate (807 vs 671 recalls in 15,987 exams; P < .001), and noninferior sensitivity (76 vs 78 of 113 cancers detected; P = .68) compared with double reading DM.
Finally, compared with double reading DM, AI with DBT resulted in a 29.7% workload decrease (P < .001), a 27.1% lower recall rate (P < .0001), and a 25% increase in sensitivity (P < .001). Typically, DBT images can take radiologists twice as long to read as DM images. With AI, however, it may be possible to “move from using digital mammograms to digital breast tomosynthesis,” Raya-Povedano said.
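The percentages in this last comparison can be checked against the raw figures quoted earlier in the article for double reading DM (222 hours, 807 recalls, 76 cancers detected) and for AI with DBT (156 hours, 588 recalls, 95 cancers detected). A minimal sketch, with `relative_change` as a hypothetical helper name:

```python
def relative_change(before: float, after: float) -> float:
    """Percentage change from `before` to `after` (negative means a decrease)."""
    return (after - before) / before * 100


# Double reading DM (before) vs AI with DBT (after), figures as quoted in the article.
workload = relative_change(222, 156)    # reading hours
recalls = relative_change(807, 588)     # recalls in 15,987 exams
sensitivity = relative_change(76, 95)   # cancers detected out of 113

print(round(workload, 1), round(recalls, 1), round(sensitivity, 1))
# prints: -29.7 -27.1 25.0
```

Because the reported hours are rounded, recomputed percentages for other comparisons may differ from the published values in the last decimal place.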
Although the findings are promising, the study’s retrospective design raises the question of how radiologists would perform if they were working alongside AI in real time.
“I think it’s fairly likely that, if the radiologists were told that a certain group has a higher pretest probability of having cancer, that would influence how they interpret the exam,” said Connie Lehman, MD, PhD, a radiologist and chief of breast imaging at Massachusetts General Hospital in Boston, in an interview. It’s possible that the rate of recalls and false positives may increase, she said.
The study also used a relatively small sample of data from a single site and features a population that the authors note is predominantly White. Applying AI to diverse study populations is important in order to identify and mitigate algorithmic biases, said Lehman.
Algorithmic bias is defined as “the instances when the application of an algorithm compounds existing inequities in socioeconomic status, race, ethnic background, religion, gender, disability, or sexual orientation to amplify them and adversely impact inequities in health systems.”
The findings may also not be directly generalizable to some countries. In the United States, for instance, double reading of mammograms isn’t standard practice for several reasons, including the prohibitive cost, said Hari Trivedi, MD, a radiologist at Emory University in Atlanta, Georgia, in an interview. However, Trivedi said there may be a role for AI as the “second reader” in the United States.
Moving forward, the researchers plan to conduct a prospective trial using AI in breast cancer screening, said Raya-Povedano.
The study was independently supported. Raya-Povedano, Lehman, and Trivedi have reported no relevant financial relationships.
Radiology. Published online May 4, 2021.
Anna Goshua is a reporting intern with Medscape. She is a dual medical and journalism student who has previously written for STAT, Scientific American, Slate, and other outlets. She can be reached at @AnnaGoshua.