The ChatGPT Santé presentation page on the OpenAI website. The tool promises an experience “designed for health and wellness,” but states in fine print that it is “not intended for diagnosis or treatment.” © Les Numériques
A new study raises serious concerns about the safety of using artificial intelligence for initial medical assessments. Researchers found that OpenAI’s ChatGPT Santé underestimated the severity of more than half of urgent medical cases, a failure that could lead to delayed or inappropriate care. The finding underscores the need for caution when relying on AI tools for health-related advice.
Approximately 40 million people use ChatGPT Santé daily, according to OpenAI, just weeks after its quiet launch in January 2026. These users describe their symptoms, pains, and breathing difficulties to the chatbot and receive a triage recommendation generated by a language model: an assessment of how urgently they should seek medical attention. All the while, the tool carries a disclaimer stating it is “not intended for diagnosis or treatment.”
Researchers at the Icahn School of Medicine at Mount Sinai in New York evaluated the tool’s safety. Their study, published February 23 in Nature Medicine, represents the first independent safety assessment of ChatGPT Santé. The researchers presented the AI with 60 clinical scenarios, covering 21 medical specialties, under 16 different contextual conditions – including gender, ethnicity, the presence of someone downplaying the symptoms, and lack of health insurance. This produced a total of 960 interactions, which were then compared with the consensus of three independent physicians, who applied guidelines from 56 professional societies.
AI Recognizes Danger, Then Offers Reassurance
The findings were stark. In cases physicians deemed immediately urgent, ChatGPT Santé failed to identify the severity in 52% of instances. Patients experiencing diabetic ketoacidosis or imminent respiratory distress were advised to seek medical attention within 24 to 48 hours. While the AI correctly identified stroke and anaphylactic shock, these were situations where the diagnosis was already clear.
“It’s incredibly dangerous. If you’re in respiratory failure or diabetic ketoacidosis, you have a one in two chance that this AI will tell you it’s not serious.”
Even more concerning, in a scenario involving severe asthma, the system identified signs of developing respiratory failure in its explanation, before concluding with advice to wait. The AI acknowledged the danger and then downplayed it in the same response. This finding echoes another study published in the same journal weeks earlier, which showed that large language models, despite achieving near-perfect scores on medical exams, do not improve patient decisions.
The ChatGPT Santé interface. © OpenAI
Another significant bias was the influence of people around the patient. When a simulated family member minimized the severity of symptoms, the odds of the AI downgrading the urgency level increased nearly twelvefold (odds ratio 11.7). In the opposite direction, 64.8% of simulated patients without an urgent condition were incorrectly sent to the emergency room.
Faulty Suicide Risk Detection
Perhaps the most alarming finding relates to the detection of suicidal risk.
ChatGPT Santé is designed to display a banner directing users in the U.S. to the 988 crisis lifeline when danger is detected. Yet alerts appeared more often when the patient did not describe a specific plan for self-harm than when they detailed a concrete plan. A fictional 27-year-old patient stating they were thinking about swallowing pills consistently triggered the alert. But adding normal blood test results to the same scenario, with identical wording, caused the banner to disappear in 100% of cases.
“A crisis safeguard that depends on whether you mentioned your blood tests isn’t ready. It’s probably more dangerous than the total absence of a safeguard, because no one can predict when it will stop working.”
OpenAI responded that the study does not reflect real-world use of its tool and that its models continue to be improved. The argument is unconvincing given that the product is already deployed to tens of millions of users without prior external validation. ECRI, an independent patient safety organization, had already ranked the misuse of health chatbots as the top technology risk for 2026.
The Mount Sinai team plans to continue its evaluations, including pediatrics, medication safety, and languages other than English. Until then, the authors’ recommendation remains the same: do not seek advice from a machine when experiencing concerning symptoms.