October 16, 2024
Study Finds ChatGPT Overprescribes Unnecessary X-Rays and Antibiotics in Emergency Care

A recent study has found that while ChatGPT shows promise in patient interactions and on medical exam questions, it tends to recommend unnecessary X-rays and antibiotics in emergency care settings.

Conducted by researchers at the University of California, San Francisco (UCSF), the study found that ChatGPT also sometimes recommended admitting patients who did not require hospital treatment. Published in Nature Communications, the research emphasizes that although the AI can be prompted toward more accurate responses, it still falls short of the clinical judgment of human doctors.

“This serves as a crucial reminder to clinicians not to rely on these models blindly,” said lead author Chris Williams, a postdoctoral scholar at UCSF. He noted, “ChatGPT can handle medical exam questions and assist with clinical documentation, but it’s not suited for complex decision-making required in emergency departments.”

In a prior study, Williams found that ChatGPT slightly outperformed humans in determining which of two emergency patients was more acutely ill. The current study, however, tasked the AI with more complicated decisions typically made after an initial patient examination: whether to admit a patient, order X-rays or other scans, or prescribe antibiotics.

The research team analyzed 1,000 emergency visits from a database of over 251,000 cases, maintaining the same ratio of “yes” to “no” responses across the three types of decisions. They entered physicians’ notes detailing each patient’s symptoms and examination results into both ChatGPT-3.5 and ChatGPT-4, then tested the accuracy of the AI’s recommendations with increasingly detailed prompts.
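The sampling step described above can be sketched in rough terms: draw a fixed-size subset from the full case database while preserving the original ratio of "yes" to "no" decisions, then score model recommendations against physicians' recorded decisions. This is only an illustrative sketch, not the study's actual code; the field names, toy data, and helper functions below are assumptions.

```python
import random
from collections import defaultdict

def stratified_sample(visits, decision_key, n_total, seed=0):
    """Sample visits while preserving the ratio of 'yes' to 'no'
    answers for one decision type (e.g. hospital admission)."""
    random.seed(seed)
    groups = defaultdict(list)
    for visit in visits:
        groups[visit[decision_key]].append(visit)
    sample = []
    for answer, group in groups.items():
        # Keep each answer's share of the sample proportional to
        # its share of the full database.
        k = round(n_total * len(group) / len(visits))
        sample.extend(random.sample(group, min(k, len(group))))
    return sample

def accuracy(predictions, labels):
    """Fraction of model recommendations matching physicians' decisions."""
    correct = sum(p == l for p, l in zip(predictions, labels))
    return correct / len(labels)

# Illustrative toy database: each visit has a physician's note and a
# recorded yes/no admission decision (1 in 4 admitted, for example).
visits = [{"note": f"note {i}", "admit": "yes" if i % 4 == 0 else "no"}
          for i in range(251000)]
sample = stratified_sample(visits, "admit", n_total=1000)
```

In the study itself, the sampled physicians' notes would then be sent to each model with increasingly detailed prompts, and the yes/no recommendations scored with something like the `accuracy` helper above.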

The findings showed that the AI models frequently recommended unnecessary services. ChatGPT-4 was found to be 8 percent less accurate than resident physicians, while ChatGPT-3.5 was 24 percent less accurate.

“AI models tend to overprescribe because they are trained on internet data, and as of now, there are no reliable medical advice platforms specifically designed to address emergency medical queries,” the researchers concluded.
