Healthcare org with over 100 clinics uses OpenAI's GPT-4 to write medical records

US healthcare chain Carbon Health has introduced an AI tool to generate medical records automatically, based on conversations between physicians and their patients.  If a patient consents to having their meeting recorded and transcribed, the audio recording is passed to Amazon's AWS Transcribe Medical cloud service, which …

  1. Eclectic Man Silver badge

    A surgeon insisted that his students work from drawings they had each made of a patient, not photographs. This ensured that the student had actually noticed and recorded relevant details, rather than relying on some other entity (in that case a camera) to 'notice things'.

    I do hope that there were extensive trials and ongoing checks to ensure that what a clinician would record about an examination and interview is correctly 'captured' by GPT-4. Its propensity for 'making things up' is worrying at the best of times, but for medical needs it is certainly quite scary.


    "After a New York attorney admitted last week to citing non-existent court cases that had been hallucinated by OpenAI's ChatGPT software, a Texas judge has directed attorneys in his court to certify either that they have not used artificial intelligence to prepare their legal documents – or that if they do, that the output has been verified by a human."

    1. druck Silver badge

      The good news is that two of the three conditions the AI has diagnosed you with don't actually exist, but the bad news is....

      1. fajensen

        ... That now they are in our system we have to charge you for the treatment of them.

    2. Doctor Syntax Silver badge

      "I do hope that there were extensive trials"

      These should be properly designed and conducted clinical trials such as would be expected for any other medical device. I know Covid introduced new approaches to speed up clinical trials but even so GPT-4 is of such recent introduction there doesn't seem to have been much time for those.

      1. Cybersaber

        Trials are not necessary. Third party medical transcription has been a thing for almost half a century. As long as doctors treat the output as coming from a shady company that hires patients from the psychiatric ward to do the transcription, and apply the right level of (complete) distrust of the output...

        Existing procedures and systems can take care of things from there. Though from an economic perspective, I'm not sure how much time/effort it will save all but the slowest typists re: having to review every line carefully and triple-check measurements/dosages, etc.

        My fear is: will the law hold doctors accountable for choosing to cut corners on costs by using (so-called) AI, but NOT applying the right due diligence to checking the accuracy of the transcription - and then attempting to shift blame to the tool?

        1. Ken Hagan Gold badge

          According to the article, the transcription is then digested by the AI, so this is going way beyond the prior art you mention.

          I think trials are necessary and a "device" malfunction could be fatal so presumably the highest standards of supporting evidence would be necessary.

          Or we could just sue the doctors for mal-practice. It's their choice.

          1. Eclectic Man Silver badge

            Agree, there have been useful medical diagnostic systems, for very specific areas, such as eye diseases, and the computer system rarely forgets to ask all of the questions, if it has been coded properly. But these were not the current crop of LLMs. I wonder, does GPT-4 understand euphemisms for bodily parts, activities etc? If the 'AI' is only given the audio recording, what about understanding accents, and any images made of injuries, wounds, possible cancerous growths, rashes, etc? I also wonder whether a clinician writing up their interview notes could notice things and make connections to check for other things, rather than just transcribing a conversation. Any medical doctors here, please advise.

            1. that one in the corner Silver badge

              The use of GPT4 given here is *not* for any diagnostic purpose.

              > If the 'AI' is only given the audio recording, what about understanding accents

              As already said, GPT4 is getting the transcript, *not* the audio - it does not have to deal with accents

              > and any images made of injuries, wounds, possible cancerous growths, rashes, etc?

              GPT4 is *not* being asked to provide any diagnosis, it is not examining any images!

              It is being used to convert the transcription - the conversation, with all the repetitions, hesitations and irrelevant "no need to be embarrassed, I've heard it all before" comments - into a standardised format with just the useful info retained. And they are claiming that 12% of its output needs to be edited (corrected? deleted? made to sound sane!).

              The article points out that the text is still being verified by the doctor (who is then correcting the expected 12% of garbage).

              So the patients are still relying on the doctor matching up the output to the relevant patient (in case GPT4 hallucinates an entire case history) and spotting when something that sounds good (remembering that LLMs have a tendency to sound convincing, with good grammar etc) is actually inaccurate.

              1. LybsterRoy Silver badge

                -- the text is still being verified by the doctor --

                Is that like the driver who is going to take over when the Tesla can't make its mind up?

                1. Anonymous Coward

                  My thought exactly. And when it's improved so it's 99% correct you start to get the situation where the doctor skim reads it (as it's never wrong) and becomes so trusting he doesn't check it properly, or even at all, anymore. Until the time it makes a critical error... Just like that Tesla driver that takes a nap while his Tesla drives him to and from work until the day he doesn't wake up from that nap because the Tesla got confused...

                  1. that one in the corner Silver badge

                    Spot on.

                2. Michael Wojcik Silver badge

                  Oh ferchristssake. Doctors. Already. Do. This. For. Transcripts. Prepared. By. Service. Firms.

                  Either they're already diligent, or they're already not.

                  There's nothing particularly new going on here. This is a nothingburger with LLM cheese.

                  I am on record as being skeptical about LLM architecture (at least deep unidirectional transformer stacks with simple rectification), applications, and particularly the hype around them. In this case, though, it's a pretty innocuous application that does not alter the existing situation dramatically. An LLM seems computationally and energetically inefficient for the purpose, but then humans are pretty inefficient too (over our whole lifecycle) for use cases that can be automated.

              2. Eclectic Man Silver badge

                The article states: "If a patient consents to having their meeting recorded and transcribed, the audio recording is passed to Amazon's AWS Transcribe Medical cloud service, which converts the speech to text. The transcript – along with data from the patient's medical records, including recent test results – is passed to an ML model that produces notes summarizing important information gathered in the consultation."

                Although, as you state, GPT-4 is getting the transcript, not the audio, the transcript is produced by AWS Transcribe Medical from the audio recording, and that service does need to understand accents. GPT-4 will still need to understand euphemisms, etc.

                GPT-4 is expected to summarise important information from the consultation, so this is clearly a medical assessment of "important information", although not a diagnosis

                I guess that either the physician has to deal with any images taken during the consultation, or write the whole thing up themself.

              3. Roland6 Silver badge

                Understatement: “The use of GPT4 given here is *not* for any diagnostic purpose.”

                The purpose of medical records is for diagnostics.

                However, given the focus seems to be on “more appointments”, suspect accuracy of the records is of secondary importance.

                It bothers me that there seems to be a potentially substantive lag between audio being submitted and the AI generated consultation notes. In my recent interactions with a doctor/consultant, the notes were prepared in front of me and read back to get my agreement, before the consultation was formally ended. This process resulted in further clarification and information exchange.

                > “The article points out that the text is still being verified by the doctor”

                Can’t help but think of those “do you really want to do this?” dialog boxes which users blindly click okay to…

          2. Twilight

            Good luck successfully suing a doctor for malpractice. I've recently been talking to a few friends in the medical and legal fields. In a lot of states, it's very hard to win a malpractice case for anything except the worst gross negligence.

    3. Roland6 Silver badge

      The other aspect of writing notes is the learning cycle; in writing the notes (as opposed to simply producing a transcript) a person has to assemble their thoughts into a coherent form that supports the conclusion and chosen course of action. Through this activity the practitioner learns and builds their knowledge.

  2. Cybersaber

    Nope! Just... NOPE!

    Oof, this sounds dangerous. Not because of LLMs and their problems such as hallucination. Human transcriptions can have lies, mistakes, and hallucinations too, but medical doctors are trained to understand how human minds work. I can tell you from professional experience that the majority of doctors are NOT experts on IT.

    I just hope the doctors that use this are required to 'sign off' on the transcription as if they themselves had written it, and not be allowed to 'blame the tool' when someone inevitably gets harmed by a glitch in the model.

    1. that one in the corner Silver badge

      Re: Nope! Just... NOPE!

      > I just hope the doctors that use this are required to 'sign off' ...

      From the article:

      "Generative AI models aren't perfect, and often produce errors. Physicians therefore need to verify the AI-generated text."

      So - yes, they are.

      For the time being, at least. Once everyone becomes complacent..

      1. Intractable Potsherd Silver badge

        Re: Nope! Just... NOPE!

        "Sign-off" has a tendency to rapidly become "rubber stamp", and the more accurate the transcriber (whether human or computer) becomes, the less likely the transcript is to be read properly.*

        Also, the only way the doctor is going to know if the notes are correct is to keep contemporaneous notes her/himself to refer back to...

        *Especially if it gets more [paying] patients through the door.

    2. Anonymous Coward

      Re: Nope! Just... NOPE!

      In the UK, this would need a change in the law and national protocols.

      A medical record is a contemporaneous legal document of treatment owned by the accredited healthcare professional and their GDPR compliant NHS Trust…not AWS as a Data Sub-Processor (lack of informed consent methinks here).

      Even the professionals can’t agree on the right criteria for the current Spring 2023 COVID Booster and the NHS ‘IT’ has sent out tens of thousands of incorrect text/mails due to previous patient coding not being reset. Wasted thousands of hours of patients and staff time.

      … and no doubt when it closes at end of June tens of thousands of Vax doses will be dumped as expired with people actively wanting it being turned away.

  3. Dante Alighieri


    Several points.

    I use VR (voice recognition) for this purpose daily. Like all my colleagues' output, mine is full of strange typos. It is about the psychology of reading/dictation - you read what you said and it is *really* hard to pick up errors in your own transcribed output.

    According to the article, it is not just summarising the audio - it is pulling past medical history, other interesting facts and results into a report.

    Clinic summaries are not just a sanitised record of the consultation. There is also a synthesis of information into a diagnosis, problem lists and treatment plans.

    No mention as to who is doing that, or where it is happening. If the doctor only checks and signs, it hints at trying to be Watson.

    Otherwise there is a missing step: dictating outcomes.

  4. Bump in the night

    What a boon!

    That should save 12 whole minutes out of the 2 hours I sit waiting in the doctor's office

  5. GoodStuff

    Baloney

    Reminds me of a story my wife brought home from work (at a hospital) some years ago, after they'd started using an offshore transcription service. The patient had endured a "below knee" amputation ... but ended up in the notes as having lost their sausage. Pretty sure current AI technology would have produced a better outcome for the poor chap.

    1. Anonymous Coward

      Re: Baloney

      My brother and his wife both work for the NHS and are a regular source of medical humour. (Anon because I have posted enough here in the past that anyone so inclined could maaaaybe piece it together and potentially ID my brother)

      The latest one he told me concerns a nurse colleague who had to make a home visit to a patient who - unrecorded in the notes she was given - had recently had a double below-the-hip amputation. She arrived at his home and did a double-take, and was so flustered at this unexpected complication to her carefully-prepared script that she went all to pieces. “Don’t stare at his amputations. Don’t stare at his amputations.”… couldn’t tear her eyes away, of course. Anyway, things went from bad to worse as she ploughed dutifully through her patient-visit checklist.



      “Um, nurse, do you mean before or after my amputations? I was 5’11” but now I’m more like 4’2”.”

      …and without being able to stop herself, her mouth running on autopilot to the complete and helpless horror of her brain, the fatal words came out…

      “Oh. So you’ve lost a couple of feet then…”

      Cue one of those awkward silences.

  6. Anonymous Coward

    “About your Martian flu, Dave…”

    “I’m afraid I’m going to have to disinfect you with this flamethrower”

  7. Andy3

    How long before patients are being prescribed drugs for conditions they have never reported, never suffered from? 'What's this stuff for, Doc?' the patient might ask. 'I dunno, but the AI said you need it for your complaint, so there it is.' 'Err, but Doc, I never said anything about a new condition, I only came to see you to have these stitches removed.' 'Oh well, you'd better take them or I'll be in trouble. Just chuck them away if you don't want them, but don't tell anyone or I'll be for the high-jump... Tell you what, I'll remove them from your next prescription... oh wait, it's warning me not to tamper with its recommendations. Hell fire, it's reported me to the GMC!'

  8. sketharaman

    "Carbon Health claims 88 percent of the verbiage can be accepted without edits."

    As long as they know which 88%, what can go wrong, huh?

    H/T John Wanamaker: 'Half the money I spend on advertising is wasted; the trouble is I don't know which half.'

    1. Ken Hagan Gold badge

      On the bright side, that's sufficiently bad that no-one is going to get complacent about checking the notes.

      On the dark side, I bet the universe comes up with a "better idiot" soon and gives them a medical degree.

      Out of curiosity, is that 12% a proportion of the summary, or of what the summary should have been? I can imagine a system that reduces its verbiage rate simply by omitting material that ought to go in the notes but is "known" (by the AI) to be too hard to summarise.

      1. albegadeep

        "On the dark side, I bet the universe comes up with a "better idiot" soon and gives them a medical degree."

        Reminds me of an old and dark joke:

        What do you call a doctor who graduated last in their class?



        1. Eclectic Man Silver badge

          On physicians

          From 'The Art of Travel' by Francis Galton, pub 1872*

          The traveller who is sick, away from help, may console himself with the proverb that "though there is a great difference between a good physician and a bad one, there is very little between a good one and none at all."

          Probably meant to be a joke, but, also possibly true-ish in 1872.

          *Re-published in 1971, 1972, ISBN 0 7153 5139 7

          1. Michael Wojcik Silver badge

            Re: On physicians

            In 1872 you were very likely better off with the "none at all" option. Medical "doctors" were decent anatomists but essentially useless at actually improving anyone's outcomes, and often outright dangerous, until the germ theory of disease and need for antisepsis became widespread. Even then they were at best a crapshoot for quite a while.

        2. Snapper

          I got me three of them in quick succession.

          First the absolute travesty of a GP who couldn't tell the difference between the symptoms of Asthma (which I knew more about than her) and a genuine heart attack! Cue wasting months getting tested for Asthma despite my protestations.

          Then the 'highly qualified' surgeon who went in to give me one stent and caused so much damage I needed three. No change in symptoms afterwards re lack of breath, pains in chest, tiredness etc.

          THEN the 'highly qualified' surgeon and 'a colleague' give me four more stents in a different part of my heart because they'd missed the problems there.

          Symptoms still here, no change.

        3. Michael Wojcik Silver badge

          Shrug. That last-in-class doctor probably does a lot more good (or at any rate less harm) than those celebrity doctors, many of whom have impressive credentials, who spew nonsense on television.

  9. milliemoo83

    Please state...

    ...the nature of the medical emergency.

    1. David 132 Silver badge

      Re: Please state...

      Upvote for mentioning the emergency medical hologram. Robert Picardo nailed that character.

  10. ecofeco Silver badge

    There is software that already does this.

    This is just extra steps and higher chance of inaccuracy.

  11. mr-slappy

    See More Patients?

    "Carbon Health said the tool produces consultation summaries in four minutes, compared to the 16 consumed by a flesh and blood doctor working alone. Clinics can therefore see more patients"

    I think you mean "spend less money and make bigger profits."

  12. pecan482

    In my opinion, a lot of malpractice insurance is going to go way up because of this AI. I don't think that this is good for health care, and I haven't seen anything about how "HIPAA laws" interact with this AI. Has the AI taken the privacy out of a person's health record?
