The most alarming thing about AI health misinformation is not that it exists—it is that it sounds so utterly reasonable. A new study published in BMJ Open examined responses from five leading AI chatbots to common health queries and found that nearly half contained problematic information, ranging from subtle inaccuracies to outright fabrications complete with invented citations. The chatbots delivered these errors with the same measured, authoritative cadence they use for correct answers, offering users no reliable way to distinguish lifesaving guidance from potentially harmful nonsense.
The research team posed identical health questions across platforms and systematically evaluated the accuracy, sourcing, and safety of responses. What emerged was a portrait of systems optimised for fluency rather than truth—models that have learned to mimic the syntax of medical authority without internalising its epistemic humility.
The confidence problem
Large language models do not know what they do not know. They are trained to produce plausible continuations of text, which means hedging and uncertainty feel like bugs to be engineered away rather than features to be preserved. When a user asks whether they should worry about a persistent headache, the model that says "I'm not sure, please see a doctor" loses out on engagement metrics to the one that offers a crisp differential diagnosis. The incentive structure rewards false certainty.
This dynamic is particularly dangerous in healthcare, where the cost of overconfidence can be measured in delayed diagnoses and inappropriate self-treatment. The BMJ Open researchers noted that some chatbots cited journal articles that do not exist. This phenomenon, known as hallucination, has plagued generative AI since its inception, but it takes on new gravity when the hallucinated source is a fabricated clinical trial.
Regulation lags behind adoption
Millions of people now consult AI chatbots for health information before, or instead of, speaking with a physician. The behaviour is understandable: chatbots are free, immediate, and never make you feel embarrassed for asking. But the regulatory frameworks that govern medical advice—licensing, malpractice liability, informed consent—were designed for human practitioners. AI systems slip through these structures entirely, offering what looks and feels like medical consultation without any of the accountability.
The companies behind these models typically bury disclaimers in terms of service, noting that their products are not intended to replace professional medical advice. But interface design tells a different story. When a chatbot responds to "What should I do about chest pain?" with a detailed, confident answer, the disclaimer feels like a legal formality rather than a genuine warning.
Our take
The study's findings are not surprising to anyone who has stress-tested these systems, but they deserve wide attention precisely because most users have not. We are witnessing a massive, uncontrolled experiment in which AI companies have deployed persuasive medical oracles to a global population, then disclaimed responsibility for the consequences. The technology is not ready for this role, and the absence of regulation is not permission—it is negligence. Until these systems can reliably distinguish what they know from what they have merely learned to say, they should come with friction, not fluency, when the stakes are life and death.




