ChatGPT, Claude, and several other chatbots offer patients a quick way to collect and access their medical records, but these tools are not always secure or accurate.

By John Halamka, M.D., M.S., Dwight and Dian Diercks President, Mayo Clinic Platform, and Paul Cerrato, M.A., senior research analyst and communications specialist, Mayo Clinic Platform
Technology companies are deepening their involvement in the healthcare ecosystem by launching chatbots that they say will help collect a patient's medical records, act as a personal health assistant, and much more. Amazon, for instance, recently introduced Health AI, Microsoft has launched Copilot for Health, OpenAI is offering ChatGPT Health, and Anthropic has introduced Claude for Healthcare. Ada Health, Buoy, and Woebot Health are already deeply involved in digital health. While each of these companies offers glowing descriptions of its product's potential benefits, independent research provides a more realistic picture.
OpenAI is slowly introducing ChatGPT Health, stating that patients will be able to upload their personal health data, including lab and imaging results and data from an Apple Watch, to its website. The company says that, after consulting with hundreds of physicians in 60 countries, it has designed the chatbot to offer advice on: “how urgently to encourage follow-ups with a clinician, how to communicate clearly without oversimplifying, and how to prioritize safety in moments that matter.” And while an OpenAI representative says it’s not the role of ChatGPT Health to tell users whether they are sick or healthy, or to make a diagnosis or treatment recommendation, the reality is that many patients will use the tool to do just that.
Investigators recently evaluated the ability of ChatGPT Health to triage 60 patient scenarios provided by clinicians and found serious shortcomings: “Performance followed an inverted U-shaped pattern, with the most dangerous failures concentrated at clinical extremes—nonurgent presentations (35%) and emergency conditions (48%).” Had users followed the app’s advice, patients with diabetic ketoacidosis or impending respiratory failure would have been under-triaged in 52% of the cases analyzed, advised to have their condition evaluated in 24 to 48 hours rather than being told to go to the emergency department.
Unfortunately, because many of the other health-related chatbots are so new, there is virtually no independent research evaluating their benefits and risks. There are, however, reports on their generic predecessors. When study participants were asked to use ChatGPT-4o, Llama 2, and Command R+ to identify the underlying conditions in 10 medical scenarios, they identified the correct disorder in less than 34.5% of cases and the correct course of action in less than 44.2%. But when the same 10 scenarios were given directly to the LLMs for analysis, the results were quite different: the models correctly identified the conditions 94.9% of the time and made the right call on what to do about the diagnosis 56.3% of the time on average. This suggests that one of the problems with these interactive apps is the interface and how patients use it.
The other issue that needs to be addressed when patients use these new chatbots is security. Most LLMs designed to share patient data are not HIPAA compliant. The U.S. federal government’s HIPAA Privacy Rule explains: “A major goal of the Privacy Rule is to assure that individuals' health information is properly protected while allowing the flow of health information needed to provide and promote high quality health care and to protect the public's health and well being. The Rule strikes a balance that permits important uses of information, while protecting the privacy of people who seek care and healing.”

To achieve that balance, “covered entities” must adhere to security rules spelled out by the Health Insurance Portability and Accountability Act. These rules require healthcare providers to implement a range of administrative, physical, and technical safeguards: they must conduct a detailed risk analysis to detect any potential threat to a patient’s protected health information, train their staff to be alert for such threats, secure their computers, and more. But because chatbots are not covered entities, they aren’t required to follow these rules, and the fact that they are exempt from these precautions doesn’t mean the data they manage is safe and secure.
Unfortunately, many patients don’t seem to see the danger. By one estimate, a third of U.S. adults use AI for health advice, and about 40% of this group has uploaded their personal medical information to a chatbot. This overreliance on AI brings to mind the age-old warning: Buyer beware.
