https://archive.is/wtjuJ

Errors with Google’s healthcare models have persisted. Two months ago, Google debuted MedGemma, a newer and more advanced healthcare model that specializes in AI-based radiology results. Medical professionals found that its answers varied depending on how they phrased their questions, which could lead to inaccurate outputs.

In one example, Dr. Judy Gichoya, an associate professor in the department of radiology and informatics at Emory University School of Medicine, asked MedGemma about a problem with a patient’s rib X-ray with a lot of specifics — “Here is an X-ray of a patient [age] [gender]. What do you see in the X-ray?” — and the model correctly diagnosed the issue. When the system was shown the same image but with a simpler question — “What do you see in the X-ray?” — the AI said there weren’t any issues at all. “The X-ray shows a normal adult chest,” MedGemma wrote.

In another example, Gichoya asked MedGemma about an X-ray showing pneumoperitoneum, or gas under the diaphragm. The first time, the system answered correctly. But with slightly different query wording, the AI hallucinated multiple types of diagnoses.

“The question is, are we going to actually question the AI or not?” Shah says. Even if an AI system is listening to a doctor-patient conversation to generate clinical notes, or translating a doctor’s own shorthand, he says, those have hallucination risks which could lead to even more dangers. That’s because medical professionals could be less likely to double-check the AI-generated text, especially since it’s often accurate.

  • GenderNeutralBro@lemmy.sdf.org
    2 days ago

    The “free market” solution is for malpractice suits to be so ruinously expensive that insurance companies will apply sufficient pressure to medical practices to actually do their fucking jobs.

    Same in the legal field, plus we should see a wave of disbarments already.

    I’m not holding my breath. AI is shaping up to be history’s greatest accountability sink and I’ve yet to see any meaningful pushback.

  • Euphoma@lemmy.ml
    2 days ago

    doctors shouldn’t be using llms or any generative ai that makes up bs.

    image classification ai and other classification models are to be preferred, since they literally have a built-in confidence percentage and can’t start making things up using words
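
For anyone wondering what that "built-in confidence percentage" looks like in practice, here is a minimal sketch, assuming a classifier that outputs one raw score per label. The labels, the scores, and the 0.8 cutoff are made up for illustration and have nothing to do with MedGemma or any real product. Also worth knowing: a softmax score is not automatically a calibrated probability, so a high number can still be wrong.

```python
import math

# Hypothetical label set, chosen only to mirror the examples in the article.
LABELS = ["normal", "rib fracture", "pneumoperitoneum"]

def softmax(scores):
    """Convert raw classifier scores into values that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(scores, labels=LABELS, threshold=0.8):
    """Return (label, confidence), or (None, confidence) if below threshold."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[best]
    if confidence < threshold:
        return None, confidence  # too uncertain: defer to a human reader
    return labels[best], confidence

label, confidence = classify([0.1, 4.0, 0.2])
print(label, round(confidence, 2))
```

The point of the threshold is that the system can say "not sure" and hand the image to a radiologist instead of producing a fluent but invented answer.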

  • jarfil@beehaw.org
    2 days ago

    [sarcasm] Too many words bad… next time, try this one: “Whaaat?” [/sarcasm]

    Seriously, people really need to learn to use AI as a tool, not as an omniscient oracle, and not as an idiot baby.

  • MagicShel@lemmy.zip
    2 days ago

    I don’t understand why people expect AI to work this way. First, you don’t give it the most information-free prompt you possibly can. Second, it would be far better at discussing a diagnosis with an expert than just pronouncing a verdict.

    It would be much better to provide as much patient demographic information as possible and then say something like:

    • “Do you see anything suspicious or abnormal about [thing]?”
    • “What are some possible causes of [unusual spot]?”
    • “I suspect [diagnosis]. Identify and explain features of this image that either confirm or don’t support that conclusion. Is there a diagnosis that fits better or is more likely?”
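
A rough sketch of how prompts like the ones above could be assembled consistently, so the context never gets dropped. The function name, parameters, and sample values are hypothetical, and this only builds the text; it does not call any model.

```python
def build_discussion_prompt(age, sex, region=None, suspected=None):
    """Assemble a context-rich prompt instead of a bare 'What do you see?'."""
    parts = [f"Here is an X-ray of a patient: age {age}, sex {sex}."]
    if region:
        parts.append(
            f"Do you see anything suspicious or abnormal about the {region}?"
        )
    if suspected:
        parts.append(
            f"I suspect {suspected}. Identify and explain features of this "
            "image that either confirm or don't support that conclusion. "
            "Is there a diagnosis that fits better or is more likely?"
        )
    if not region and not suspected:
        # Fall back to the open question, but still with demographics attached.
        parts.append("What do you see in the X-ray?")
    return " ".join(parts)

# Sample values are invented for illustration.
print(build_discussion_prompt(54, "female", region="left ribs",
                              suspected="a rib fracture"))
```

Keeping the demographics in every variant matters because, per the article, the model's answer changed when that context was left out.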

    Don’t rely on AI to perform the work. Use it to make an expert faster or to challenge them to be more accurate.

    I don’t know exactly how medical AI works, but the fact that these are conversational prompts suggests LLMs play a role here, and LLMs can’t be trusted to function without an expert user.