Last time I checked thoroughly (roughly two years ago), AI (in the form of small ML models) mostly outperformed radiologists in areas where the gold standard is "one level" above, imaging-wise. By that I mean that you train a model to detect on an X-ray what would normally need a CT. Or you train it to see on a non-contrast CT what would normally need contrast, an MRI, a biopsy, and so on.

Essentially the cutting edge reaches up to 99% of human performance on the task it is trained on, which is good enough for triage but not for a final diagnosis. However, magic sometimes happens when you train a model to detect something that you already know is there, on an examination that is cheaper, faster or less invasive than the human "gold standard". Conveniently, this kind of dataset already exists, since it's common to first do a cheap examination like an X-ray and then escalate if nothing is found (or if something is found that you want to see better, or for a number of other reasons).
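To make the labeling idea concrete, here is a minimal sketch of how such a paired dataset could be assembled. Everything in it is a made-up illustration: the `Exam` record, the `build_paired_labels` helper, and the 7-day follow-up window are my assumptions, not anything taken from the linked papers. The point is only that the training label comes from the escalated exam, not from the original read of the cheap one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Exam:
    """Hypothetical exam record; field names are illustrative only."""
    patient_id: str
    modality: str                    # e.g. "xray" or "ct"
    day: int                         # days since an arbitrary reference date
    finding: Optional[bool] = None   # result of the human read, if any


def build_paired_labels(exams, cheap="xray", gold="ct", max_gap_days=7):
    """Pair each cheap exam with the nearest follow-up gold-standard exam.

    The label is taken from the gold-standard exam, so the model is trained
    to predict on the cheap exam what was only confirmed after escalation.
    The 7-day window is an arbitrary assumption.
    """
    pairs = []
    for x in exams:
        if x.modality != cheap:
            continue
        followups = [
            g for g in exams
            if g.modality == gold
            and g.patient_id == x.patient_id
            and 0 <= g.day - x.day <= max_gap_days
            and g.finding is not None
        ]
        if not followups:
            continue  # never escalated, so no gold-standard label exists
        nearest = min(followups, key=lambda g: g.day - x.day)
        pairs.append((x, nearest.finding))
    return pairs


exams = [
    Exam("p1", "xray", 0, finding=False),  # X-ray was read as negative
    Exam("p1", "ct", 1, finding=True),     # CT the next day found a fracture
    Exam("p2", "xray", 0, finding=False),  # no follow-up, so not usable here
]
pairs = build_paired_labels(exams)
```

In this toy data the only training pair is p1's X-ray with a positive label, even though the X-ray itself was read as negative. Those are exactly the cases where a model can end up beating the human read of the cheaper exam.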

Examples of AI outperforming humans like this include detecting sacral fractures on X-rays better than radiologists (who normally take a CT to conclusively exclude them), detecting potential precursors to pancreatic cancer on non-contrast CTs (where contrast or an MRI is usually required) and detecting an occluded coronary artery on an ECG without the archetypical "ST-elevation changes".

See the links below for references:
https://pmc.ncbi.nlm.nih.gov/articles/PMC9478257/
https://www.nature.com/articles/s41591-023-02640-w
https://rebelem.com/a-winning-hand-in-cardiology/

So AI, as a general rule, doesn't match or exceed the upper bound of "gold standard" medical performance. But it tends to carry the quality of that upper bound downwards, towards the faster, less expensive and less invasive methods. In some cases, like ECGs, that's huge. In some cases it saves time, in some cases it decreases miss rates from tired radiologists or triages their review feed. And in some cases it's not very useful.

LLMs don't come close to specialized radiology models at the moment, because LLMs are more about applying knowledge than creating new correlations. Of course that's also hugely useful, but that's a different topic to unpack.