Artificial intelligence to detect brain hemorrhages better than expert radiologists

An algorithm developed by scientists did better than two out of four expert radiologists at finding tiny brain hemorrhages in head scans--an advance that one day may help doctors treat patients with traumatic brain injuries (TBI), strokes and aneurysms.

The continued increase in diagnostic imaging studies, including 3D imaging studies such as computed tomography (CT), means that radiologists are looking at thousands of images each day, searching for tiny abnormalities that can signal life-threatening emergencies. The number of images from each brain scan can be so large that on a busy day, radiologists may opt to scroll through some large 3D stacks of images using mice with frictionless wheels, almost like viewing a movie. But it could be much more efficient--and potentially more accurate--if AI technology could pick out the images with significant abnormalities, so radiologists could examine them more closely.

"We wanted something that was practical, and for this technology to be useful clinically, the accuracy level needs to be close to perfect," said the co-corresponding author of the study, published in Proceedings of the National Academy of Sciences (PNAS). "The performance bar is high for this application, due to the potential consequences of a missed abnormality, and people won't tolerate less than human performance or accuracy."

The algorithm the team developed took just one second to determine whether an entire head scan contained any signs of hemorrhage. It also traced the detailed outlines of the abnormalities it found--demonstrating their location within the brain's three-dimensional structure. Some spots may be on the order of 100 pixels in size, in a 3D stack of images containing over a million pixels, and even expert radiologists sometimes miss them, with potentially grave consequences.

The algorithm found some small abnormalities that the experts missed. It also noted their location within the brain, and classified them according to subtype, information that physicians need to determine the best treatment. And the algorithm provided all of this information with an acceptable level of false positives--minimizing the amount of time that physicians would need to spend reviewing its results.

The author said one of the hardest things to achieve with the AI technology was the ability to determine whether an entire exam, consisting of a 3D "stack" of approximately 30 images, was normal.

"Achieving 95 percent accuracy on a single image, or even 99 percent, is not OK, because in a series of 30 images, you'll make an incorrect call on one of every 2 or 3 scans," the author said. "To make this clinically useful, you have to get all 30 images correct--what we call exam level accuracy. If a computer is pointing out a lot of false positives, it will slow the radiologist down, and may lead to more errors."

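The gap between single-image accuracy and exam-level accuracy can be illustrated with a little arithmetic. The sketch below is not from the study. It assumes, for simplicity, that errors on the 30 images of an exam are independent, which real scans will not strictly satisfy. The function name and the sample accuracy values are illustrative choices.

```python
def exam_level_accuracy(per_image_accuracy: float, images_per_exam: int = 30) -> float:
    """Probability that every image in an exam is called correctly.

    Assumption (for illustration only): errors on different images
    are independent, so the per-image probabilities multiply.
    """
    return per_image_accuracy ** images_per_exam


if __name__ == "__main__":
    # Sample per-image accuracies; 0.95 and 0.99 are the figures quoted above.
    for acc in (0.95, 0.99, 0.999):
        exam_acc = exam_level_accuracy(acc)
        print(f"per-image {acc:.3f} -> exam-level {exam_acc:.3f}")
```

Under this assumption, 95 percent per-image accuracy gives a fully correct exam only about 21 percent of the time, and 99 percent gives about 74 percent, meaning that roughly one exam in four would still contain at least one wrong call.
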
The radiology experts said the algorithm's ability to find very small abnormalities and demonstrate their location in the brain was a substantial advance.

"The hemorrhage can be tiny and still be significant," said another author. "That's what makes a radiologist's job so hard, and that's why these things occasionally get missed. If a patient has an aneurysm, and it's starting to bleed, and you send them home, they can die."

The new study made use of a type of deep learning known as a fully convolutional neural network, or FCN, which can be trained on a relatively small number of images, in this case 4,396 CT exams. But the training images used by the researchers were packed with information, because each small abnormality was manually delineated at the pixel level. The richness of this data--along with other steps that prevented the model from misinterpreting random variations or "noise" as meaningful--created an extremely accurate algorithm.

The scientists could have chosen to feed the network an entire stack of images, or one complete image, all at once. Instead, they chose to feed it only a portion or "patch" of an image at a time, contextualizing this patch with the images that directly preceded and followed it in the stack. Viewing an image in patches is also how people read text or look at a computer screen, and this enabled the network to learn from the relevant information in the data without "overfitting" the model by drawing conclusions based on insignificant variations that were also present in the data. They called their model PatchFCN.
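
The idea of a patch with context from neighboring images can be sketched in a few lines. This is a minimal illustration, not the PatchFCN implementation. The function name, the patch size, the nested-list image format and the choice to reuse the first or last image at the ends of the stack are all assumptions made for the example.

```python
from typing import List

Image = List[List[float]]  # one 2D image as rows of pixel values


def extract_patch_with_context(
    stack: List[Image], z: int, row: int, col: int, size: int
) -> List[Image]:
    """Return three patches: the same window cut from the previous,
    current and next image in the stack.

    Assumption: at the first or last image, the missing neighbor is
    replaced by the nearest existing image (index clamping).
    """
    depth = len(stack)
    patches = []
    for dz in (-1, 0, 1):
        zz = min(max(z + dz, 0), depth - 1)  # clamp to a valid image index
        window = [r[col:col + size] for r in stack[zz][row:row + size]]
        patches.append(window)
    return patches


def make_toy_stack(depth: int = 3, height: int = 4, width: int = 4) -> List[Image]:
    """Toy stack in which each pixel value encodes its (z, row, col) position."""
    return [
        [[100 * z + 10 * r + c for c in range(width)] for r in range(height)]
        for z in range(depth)
    ]


if __name__ == "__main__":
    toy = make_toy_stack()
    for patch in extract_patch_with_context(toy, z=1, row=1, col=2, size=2):
        print(patch)
```

Each call returns a small three-layer input, so a network trained this way sees many local windows per exam instead of one whole image or stack at a time.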

"We took the approach of marking out every abnormality--that's why we had much, much better data," said another co-corresponding author of the study. "Then we made the best use possible of that data. That's how we achieved success."

The authors are now applying the algorithm to CT scans from trauma centers across the country that are enrolled in a research study. "Given the large number of people who suffer from traumatic brain injury every day and are rushed to the emergency department, this has very big clinical importance," the author said. "That convinced me to work on this problem."