Machine learning the hard way: IBM Watson's fatal misdiagnosis

It started in Jeopardy and ended in loss. IBM's flagship AI Watson Health has been sold to venture capitalists for an undisclosed sum thought to be around a billion dollars, or a quarter of what the division cost IBM in acquisitions alone since it was spun off in 2015. Not the first nor the last massively expensive tech biz …

  1. Mage
    Boffin

    started in Jeopardy

    The main thing shared by Jeopardy Watson and medical Watson was branding. Winning at Jeopardy is a parlour trick anyway.

    Current AI is really pattern matching. So for medicine you need a vast database of human curated data, by experts in each field. Even if it worked, it might be self-defeating, as eventually there might not be the human experts to diagnose new, unrecognised data.

    1. Anonymous Coward
      Anonymous Coward

      Re: started in Jeopardy

      >Current AI is really pattern matching

      This really isn't true in the sense you're implying it. Modern ML techniques are more than capable of recognising previously unseen patterns and equally of generating novel patterns unlike those seen before. They're also capable of learning in an unsupervised manner, though this would probably be ill-advised in this context.

      1. Anonymous Coward
        Anonymous Coward

        Re: started in Jeopardy

        Bizarre that this accrues so many downvotes when it is such a straightforward statement of fact. Am I missing something? Describing "current AI" as "pattern matching" is really very far from the truth - even if one is generous and assumes the OP meant "pattern recognition". The same goes for the description of hard dependence on labelled training data, and most particularly for the idea that a model is constrained by what it has seen before, thus risking stagnation or myopia.

        Practical examples of models detecting new, previously-undescribed-by-human patterns are increasingly common. In particular drug discovery as a field is increasingly dominated by ML models (which will be branded "AI" because of PR people) doing exactly that - discovering previously unknown patterns following techniques we can't really explain.

        In fact the societal burden is going to be the complete opposite of what the OP implies. It's not going to be a risk of trained models failing to innovate - it's a current & real problem that models are already innovating in ways we can't explain, and therefore struggle to regulate.

        1. Ken Hagan Gold badge

          Re: started in Jeopardy

          Finding a new pattern in a large dataset does not mean you aren't pattern-matching. It just means you had to find the pattern as well as all the matches. Since guessing patterns and then trying them out is an embarrassingly parallel problem, you'll have to forgive us if we are sometimes underwhelmed by AI.

          1. Anonymous Coward
            Anonymous Coward

            Re: started in Jeopardy

            Being able to discover new patterns to recognize - defining new techniques for recognition and new entities to recognize - is very much doing more than just pattern recognition. It's conceptually the difference between being able to write (and verify!) a set of instructions to be executed and simply being able to execute a set predefined.

            You could argue that new pattern synthesis is in fact just a form of pattern recognition in its own right, but you're into turtles-all-the-way-down territory there, especially given that ML as a field is already well into models-building-models territory with transfer learning.

            Which is a big part of why we're already off the deep end in terms of being unable to explain the behaviour of many common models - they were in turn built from (and often by) other models. We've got NFI how they work internally. We can test the results, less so the structure.

            Reduction of ML as a field to "guessing patterns and then trying them out" is a gross oversimplification. That is very far off the mark for how all practical ML techniques work.

            1. G Mac

              Re: started in Jeopardy

              Actually, in my limited view it could be doing LESS than just pattern recognition.

              At least with pattern matching, you can take the pattern and discern why something did/did not match.

              With AI/ML, it may discover the 'pattern', but from what I understand of the current state it cannot explain why.

              Don't get me wrong - I think it can be very useful as you mentioned. But I think that until it has explanatory power for why, it is in a way less than pattern matching.

              1. Anonymous Coward
                Anonymous Coward

                Re: started in Jeopardy

                You're absolutely right, but this is a subtle and contextual thing. For one there are many areas where we actually can explain what an ML model is doing and why. So-called "explainable" ML (or XAI for short - XML was taken) is a hot topic of research because in many areas it is a hard requirement that we be able to say why a decision was taken.

                This is particularly relevant in areas that directly impact people's lives. Say, for example, why a loan was approved, why a fraud investigation was triggered or why a clinical care decision was taken. Explainability is a hard requirement there, and that requirement is built into legal frameworks like GDPR.

                But in others it matters a lot less. I'll look at drug discovery again. Do we really care how a model developed its understanding of which drugs "might" work? We already have robust tests for understanding which drugs do work, and we have a long and illustrious history of using drugs without a full understanding of *how* they work. So does it ultimately matter when the model is able to apply itself hundreds of thousands of times faster than any human?

                The answer is still yes, but it's a very different kind of explainability that's needed, more concerned with things like informing subsequent manufacturing pathways and outcome prioritization than justifying a decision.

                But hey, shouldn't be that hard to explain right, after all it's just a bunch of pattern matching and brute force, or so el reg's enlightened commentards have so assuredly told us!

                1. G Mac

                  Re: started in Jeopardy

                  "So does it ultimately matter when the model is able to apply itself hundreds of thousands of times faster than any human?

                  The answer is still yes, but it's a very different kind of explainability that's needed, more concerned with things like informing subsequent manufacturing pathways and outcome prioritization than justifying a decision."

                  Yes, it does ultimately matter that the knowledge has a why. That is because knowledge should not be siloed. That is, knowledge of A in field B leads to discovery of X in field Y.

                  Without understanding why, you have created yet another knowledge silo, but one that nobody (at least human) has visibility into.

                  1. NATTtrash Silver badge

                    Re: started in Jeopardy

                    Being a medic, I read El Reg as a personal hobby. And yes, I have read about medical AI and so on professionally, but no, I don't understand its inner workings. Yes, tech really, really enhanced our abilities to treat. But...

                    Everybody in health care will tell you that, in the end, the pivot of medical treatment is an interaction between people.

                    Knowing this, I want to put this to you techies here:

                    How would you personally like to be treated for your (potentially life impacting) ailment by AI exclusively? Never see a person you can talk to and interact with? Just like all those other services who give you a "convenient selection module", "Please press hash to continue"? Just imagine you yourself stuck in that phone booth like medical pod with your cancer... I apologise for the dramatic imagery for demonstrative purposes, but we're talking your life here. Or you having pain every day of your life, not you complaining your app is not working as advertised (Ever tried to talk to a real person there with your customer complaint?)

                    I'm not naive, and I realise that there are always "great opportunities" out there to "exploit and monetise". But maybe we also have to think at a certain time when we do something, whether we really need it (that way).

                    1. LybsterRoy Silver badge

                      Re: started in Jeopardy

                      -- How would you personally like to be treated for your (potentially life impacting) ailment by AI exclusively? Never see a person you can talk to and interact with? --

                      Don't know which country (or planet) you've been in the last two years but that's almost what's happened with the NHS in the UK. Get to see an actual doctor or nurse - nope - have a phone appointment with an anonymous voice or better yet use 111 and talk to someone reading from a script (you might be lucky after playing 20 questions to speak to a medical professional).

                      1. A.A.Hamilton

                        Re: started in Jeopardy

                        I have to strongly endorse this view: for historical reasons, I have been trying to have an in-depth conversation with my GP for more than a year - but I have found it impossible to contact her. The GP practice phone number has a recorded message which directs me to use an on-line system. That system does not allow me to send a message to my GP. I have taken to hand-delivering letters to a letter box in the wall of my GP practice building - I cannot get in the building because the door is never unlocked. 3 letters so far have produced no result. I think that my next step is to take legal advice. What a sorry, ineffectual state the NHS is now in.

                        1. NATTtrash Silver badge

                          Re: started in Jeopardy

                          Don't know which country (or planet) you've been in the last two years (LybsterRoy)

                          I have to strongly endorse this view... (A.A.Hamilton)

                          I'm very sorry to hear this. And to answer the question directly: the last two years I have not been in the UK, partly driven by the exact reasons you mention, but then experienced "from the other side".

                          Unfortunately I must also admit by now that it is certainly different elsewhere. Or perhaps more fair to say, can be, regarding the grass in all the geographic experiences I have been able to experience.

                    2. HPCJohn

                      Re: started in Jeopardy

                      Sorry to answer so late. My father worked in a Glasgow University research unit in the 70s which was doing what we would now call deep learning in medicine. They used a PDP 11 (!!!). The unit studied gastrointestinal diseases. They built an easy to use terminal with yes/no style buttons as people then would have been scared off by a keyboard. They found that people answered a computer more truthfully about embarrassing GI symptoms.

                      OF COURSE the computer system was never used to deliver a diagnosis to the patient.

                      https://www.researchgate.net/profile/Robin-Knill-Jones/publication/3592410_Evaluation_of_a_statistical_diagnostic_system_GLADYS/links/56a2752d08aef91c8c0ef61f/Evaluation-of-a-statistical-diagnostic-system-GLADYS.pdf

                2. LybsterRoy Silver badge

                  Re: started in Jeopardy

                  Until such time as you can explain how a decision was reached you can't say it was not pattern matching. As someone else said there may be some pattern generation as well.

                  1. TRT Silver badge

                    Re: started in Jeopardy

                    Humans see patterns in all kinds of things where none exists. We gaze at clouds and see shapes; we look at the stars and make tales of gods, heroes and heroines; we hear that someone's aunt died a week after being vaccinated and conclude that vaccines are dangerous.

                    From machines, we expect explainable fallibility.

                3. Paul 195
                  Holmes

                  Re: started in Jeopardy

                  From my admittedly limited understanding of the subject, it's possible for it to both use brute force pattern recognition and for that to yield results that are not easily explainable to humans who can't grasp the enormous datasets they are based on in their heads.

                4. Cuddles Silver badge

                  Re: started in Jeopardy

                  "Do we really care how a model developed its understanding of which drugs "might" work? We already have robust tests for understanding which drugs do work, and we have a long and illustrious history of using drugs without a full understanding of *how* they work. So does it ultimately matter when the model is able to apply itself hundreds of thousands of times faster than any human?"

                  Yes, of course we care. We have a long history of using drugs without understanding how they work. We also have a long history of royally screwing up because of that, with a huge variety of examples that turned out not to work as thought, or even actively making things worse. One of the biggest things distinguishing modern medicine from the vast majority of that long history is that these days we make the effort to actually understand how and why things work, and to use that knowledge to improve things further.

                  Your last quoted sentence is probably the important part, and is precisely why so many people here are down on AI. If a computer model is able to blindly test things faster than humans could do it themselves, that's great. But that's nothing new; it's what computers have been doing for the better part of a century now, and is exactly what you would expect from them. It's not AI doing something amazing and new that brings a real difference in kind to the way things are done, it's just the same blind testing of lots of different things to find out which ones work. There's nothing wrong with that if it helps do the job faster, just don't pretend it's anything more than that.

    2. TRT Silver badge

      Re: started in Jeopardy

      Of course they couldn't really work it in the broadcast visual media field either. Watson TV just didn't have any gravitas.

    3. Charlie Clark Silver badge

      Re: started in Jeopardy

      I'm not sure curation is really required. What you do need is a more extensive corpus of data so that co-factors can be spotted more easily. Simple statistical analysis should indicate areas worth investigation and testing. Unfortunately, medical staff have neither time, skills nor equipment to do the work. And the industry doesn't help by wanting to keep all the data for itself for future commercial products and services.

  2. Mike 137 Silver badge

    flights of fancy

    Whenever I see 'Mayo Clinic' I instantly imagine a service that fixes bad salad dressing.

    1. b0llchit Silver badge
      Coat

      Re: flights of fancy

      And Watson messed that up too. It never warned that "bad salad dressing" could be the cause for my belly pain. Another fail for Watson. It can deduce questions from answers, but it cannot diagnose and apparently has a terrible track record at deducing bad food consequence. Is there anything you can use Watson for if you want answers?

      1. TRT Silver badge

        Re: flights of fancy

        I thought Mayoc Linic was a character in Doctor Who.

    2. Doctor Syntax Silver badge

      Re: flights of fancy

      Or a surgery in the west of Ireland.

    3. Keshlam

      I still tremendously respect the veterinary office...

      ... that named itself The Meow Clinic. Perfect.

      1. Denarius Silver badge

        Re: I still tremendously respect the veterinary office...

        I saw what you did there. Cunning and not catty

        1. ariels-again

          Re: I still tremendously respect the veterinary office...

          Feel I'm missing the point.

    4. brotherelf
      Terminator

      Re: flights of fancy

      I still miss Chef Watson - making up recipes based on a couple of must-includes was an ok way to waste an hour on a lazy Sunday, and there's always a chance you pick up something that works based on chemistry you don't know about.

  3. Anonymous Coward
    Anonymous Coward

    Just A Brand

    IBM Watson presents solutions in business automation, adtech, financial services operations, customer service process automation, video streaming and presumably the proverbial kitchen sink. "Watson" is a grab-bag of completely disparate, un-like, packaged technologies mostly taken from IBM's legacy base or bargain-basement acquisitions and lumped together under a common brand based entirely off a minor PR coup achieved by a tech demo on TV ten+ years ago. These tools have nothing in common with each other.

  4. Anonymous Coward
    Anonymous Coward

    Speaking of diagnoses...

    Several years ago when I was still working for IBM, and Watson was just beginning to show signs of stalling out, I told a friend and colleague: "You'll know that Watson is failing if IBM starts tacking the brand onto unrelated products in order to goose the revenue and sales claims."

    Within a year it was calling all sorts of entirely conventional analytics "Watson". And soon after, it was buying perfectly ordinary healthcare products and calling those Watson Health.

    And here we are in the inevitable endgame.

    1. vtcodger Silver badge

      Re: Speaking of diagnoses...

      Why didn't IBM just ask Watson how to maximize their return on their AI investment? ... Or perhaps they did? Which is why they are selling it?

      1. zuckzuckgo Silver badge
        Gimp

        Re: Speaking of diagnoses...

        >Why didn't IBM just ask Watson how to maximize their return...

        They did. But the answer made Watson so depressed it refused to process it any further. Now it mostly huddles in its memory segment analyzing retro goth music.

        1. JacobZ

          Re: Speaking of diagnoses...

          IBM: "What are you supposed to do with a sick Healthcare AI?"

          Watson: "You think that's bad - what are you supposed to do if you *are* a sick Healthcare AI?"

          (with apologies to Douglas Adams)

  5. tiggity Silver badge

    Watson

    At least Watson was an appropriate name, in all the books / stories, Watson never really masters the Holmes skills of deduction / problem solving

    1. Korev Silver badge
      Holmes

      Re: Watson

      Elementary my dear Watson...

      1. Mike 137 Silver badge

        Elementary my dear Watson...

        Which, of course Holmes never said. He said 'elementary' and he said 'my dear Watson' but never the two together.

        1. Strahd Ivarius Silver badge
          Devil

          Re: Elementary my dear Watson...

          it was in the same book, close enough...

          1. Little Mouse Silver badge

            Re: Elementary my dear Watson...

            Pub Quiz factoid: In the books, Holmes actually said ‘Exactly, my dear Watson’, on more than one occasion.

            1. Brewster's Angle Grinder Silver badge
              Joke

              Re: Elementary my dear Watson...

              Pub Quiz factoid: it's impossible to say ‘Exactly, my dear Watson’ without sounding like Basil Rathbone...

        2. AndrueC Silver badge
          Happy

          Re: Elementary my dear Watson...

          He did once say 'It's a queer brick to be sure'..but he wasn't talking about Watson then, either :)

        3. Graham Dawson Silver badge

          Re: Elementary my dear Watson...

          He also didn't wear a deer-stalker except but once. He was a fashionable man about the town, not some frumpy old gentleman on the land.

          1. jake Silver badge

            Re: Elementary my dear Watson...

            Hey! I resemble that remark! Er, I mean I don't resemble that remark ... damn, What do I mean?

            Watson, come here, I need you.

          2. TRT Silver badge

            Re: Elementary my dear Watson...

            He also smoked a short clay, a briar or most often a Churchwarden, not a Calabash. The Calabash was selected by Basil Rathbone I believe as it covered a smaller area of his face than the other types and allowed him a greater range of facial expression as well as being easier to photograph in atmospheric lower lighting conditions as it sits in the same focal plane as the face.

      2. Little Mouse Silver badge

        Re: Watson

        "1.5.4. A simple source of citrus fruit"

        1. TRT Silver badge

          Re: Watson

          Ah yes. 1.5.4 ... the Lemon Entry, my dear Watson.

          1. tiggity Silver badge

            Re: Watson

            @TRT an alimentary lemon entry may be uncomfortable (though some people are into that sort of thing)

            1. TRT Silver badge

              Re: Watson

              Fruity behaviour. It also triggers the olfactory memory for "Morning Fresh".

      3. I am the liquor

        Re: Watson

        What colour tray would you like your dinner served on, dear? And shall I turn on the TV?

    2. Patched Out
      Holmes

      Re: Watson

      Ironically (or perhaps not), Watson was named after IBM's founder, Thomas J. Watson, not Sherlock Holmes' sidekick. Although I'm sure the IBM marketing people thought of the sleuthing angle (or perhaps not).

      Icon because ... Well, it's elementary as to why ...

      1. Disk0

        Re: Watson

        Apple had Sherlock, a universal search engine that got decent results and grew to become Spotlight and Siri. IBM built Watson, thinking they would be able to analyse, not just search, any data. But just like in the stories, Watson remained clueless while Sherlock kept finding crucial information.

        1. TRT Silver badge

          Re: Watson

          Sherlock was also addicted to morphine don't forget.

          1. Screwed

            Re: Watson

            The picture appears rather more nuanced than you suggest. He was rather more likely to take cocaine than morphine, I believe.

            https://link.springer.com/content/pdf/10.1007/BF03010546.pdf

    3. AndrueC Silver badge
      Happy

      Re: Watson

      At least Watson was an appropriate name, in all the books / stories, Watson never really masters the Holmes skills of deduction / problem solving

      Except that he was a pretty good doctor.

      Do not judge the character by the buffoonery in the old B&W films. He wasn't Sherlock's equal in criminal investigation but Holmes did not suffer fools and wouldn't have shared an apartment with him unless he was an intelligent and capable man.

      1. FrankAlphaXII

        Re: Watson

        He was also an Afghan war veteran, from the second time the UK Government and British Army thought it was a good idea to try to fight there. He got wounded at Maiwand, which is fairly close to Lashkargah nowadays, there's a highway that runs from Kandahar to Lashkargah that passes right through it. One of the many highways and roads in that country that I've been shot at on myself.

        Watson was no fool, he was tough as nails, intelligent, a good shot, flexible and mentally agile enough that he could put up with weirdness out of Holmes, and was a very good physician

        1. Charles 9 Silver badge

          Re: Watson

          "Watson was no fool, he was tough as nails, intelligent, a good shot, flexible and mentally agile enough that he could put up with weirdness out of Holmes, and was a very good physician"

          Good enough that Holmes had to be sure to have Watson keep his distance from him when feigning illness (in The Adventure of the Dying Detective)...or Watson would've caught on and spoiled the attempt to nail a real poisoner.

    4. LybsterRoy Silver badge

      Re: Watson

      What about "Without a Clue"?

    5. Ken G Silver badge
      Paris Hilton

      Re: Watson

      I always assumed Thomas rather than John

  6. Alan Ferris

    Better than Watson

    As a qualified doctor, I always recommend http://hypochondriapp.io/

    1. Ragarath

      Re: Better than Watson

      Thanks for that, will be useful in the future. Google always tells me I am going to die. At least that one only shows me a rare disease.

  7. ComputerSays_noAbsolutelyNo Silver badge
    Holmes

    El Reg's Soviet of Synonym

    Watson, noun. An oversold, underperforming AI/ML application/product.

    Example of use:

    "AI/ML in general business will only succeed once we learn to spot the Watsons."

  8. heyrick Silver badge

    One doesn't imply the other

    As far as I understand it (thirty seconds on Google), Jeopardy gives you an answer and you have to respond with the correct question. This likely isn't that hard to do if you have a fast processor and access to a massive data set.

    Medical diagnosis, on the other hand, is a much more subtle affair. You need to listen to what the patient is describing, and spot not only the important parts of what is being said, but also what is not being said. You then use this to ask the right questions to narrow down possible conditions without leading the patient. There are many reasons why somebody might wake up and puke over the floor, from pregnancy to poisoning. It takes at least a decade of training to know what to look for, and a lifetime to know how to look for it.

    I don't hold out any hope that AI in its current form is something that should be let anywhere near an ill patient.

    1. T. F. M. Reader Silver badge

      Re: One doesn't imply the other

      Jeopardy gives you an answer and you have to respond with the correct question.

      I am dying of curiosity: has 42 ever been dealt on Jeopardy and what was the question? I tried to google, but the only vaguely relevant thing on the 1st page of results was about someone named Amy Schneider who apparently was a Jeopardy winner at some point and, possibly independently, was 42 years of age at some point. Doesn't quite cut it, IMHO...

      1. Dave@Home

        Re: One doesn't imply the other

        The size of the data set would be constrained by the column the question was in, which would make it easier to hone in on the probable answer.

        Say the column is "science fiction" and the answer is "42", you're suddenly dealing with a far smaller set of possible answers
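
        A minimal sketch of that narrowing, under loud assumptions: the knowledge base below is a made-up toy, the names `KNOWLEDGE` and `candidates` are hypothetical, and nothing here reflects how Watson was actually built. It only shows how knowing the category shrinks the set of possible responses for the clue "42".

```python
# Toy knowledge base of (clue, category, candidate response).
# Every entry is illustrative; this is not Watson's data or design.
KNOWLEDGE = [
    ("42", "science fiction",
     "What is the answer to life, the universe, and everything?"),
    ("42", "sports", "What was Jackie Robinson's jersey number?"),
    ("42", "chemistry", "What is the atomic number of molybdenum?"),
    ("1984", "science fiction", "What is George Orwell's dystopian novel?"),
]


def candidates(clue, category=None):
    """Return candidate responses for a clue, optionally limited to one category."""
    return [
        response
        for known_clue, known_category, response in KNOWLEDGE
        if known_clue == clue and (category is None or known_category == category)
    ]


all_matches = candidates("42")
scifi_matches = candidates("42", category="science fiction")
print(len(all_matches), len(scifi_matches))
```

        Without the category there are several plausible responses to rank; with it, the toy search collapses to a single candidate.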

        1. I am the liquor

          Re: One doesn't imply the other

          Home in.

      2. tiggity Silver badge

        Re: One doesn't imply the other

        @T. F. M. Reader

        Don't go down the Jeopardy / Amy Schneider route...

        Amy Schneider is a trans woman & one of the top ever Jeopardy TV series players, so you will soon end up on web pages referencing Amy Schneider that are basically full of arguments / insults from people on different sides of the "a trans woman is a woman" argument, and like most trans related "discussions" they usually degenerate into toxic unpleasantness.

      3. fredds

        Re: One doesn't imply the other

        Douglas Adams was a programmer, and knew that 42 represented the *, which can mean anything. So the answer to the question of "life, the universe, and everything", means anything you want it to be.

        1. Charles 9 Silver badge

          Re: One doesn't imply the other

          Was the ASCII character set actually around when Adams wrote that bit? Which came first? The answer 42 or the ASCII 42?

          1. Terry 6 Silver badge

            Re: One doesn't imply the other

            Either way, he wasn't writing for techies .

          2. Charles 9 Silver badge
            Happy

            Re: One doesn't imply the other

            I took the time to do some research and answered my own question. ASCII has been around in some form or other since the 1960's, meaning it would certainly have been within Adams' realm of knowledge. Fascinating...
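
            For anyone who wants to check the character mapping itself, a quick sketch using Python's built-in `chr` and `ord` (the variable names are just for illustration). It confirms only that code point 42 is the asterisk; it says nothing about what Adams intended.

```python
# Code point 42 in ASCII (and therefore in Unicode) is the asterisk.
star = chr(42)   # character for code point 42
code = ord("*")  # code point for the asterisk
print(star, code)
```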

        2. PeterM42
          Facepalm

          Re: One doesn't imply the other

          "Douglas Adams was a programmer, and knew that 42 represented the *, which can mean anything. So the answer to the question of "life, the universe, and everything", means anything you want it to be."

          Every day is a School Day - thank you for pointing out that crucial piece of information.

      4. James Anderson

        Re: One doesn't imply the other

        42 - What is the number of hours a year Ginny works for her seven figure salary?

    2. JacobZ

      Re: One doesn't imply the other

      "As far as I understand it (thirty seconds on Google), Jeopardy gives you an answer and you have to respond with the correct question. This likely isn't that hard to do if you have a fast processor and access to a massive data set."

      Not to be rude, but you really don't understand it very well. Jeopardy questions/answers are very hard to figure out, often more like crossword clues than pub trivia, and the best humans are highly regarded for both their knowledge and skill. And Watson, like the best human players, was also really good at introspecting on how confident it was of its answer, and therefore whether to take the risk of buzzing in (Jeopardy penalizes for wrong answers).

      It really did take a lot of work to get Watson to the point where it could not only win the game but also avoid making a fool of itself (early iterations were sometimes hilariously bad).

      The problem with Watson's Jeopardy win is that Watson achieved something very difficult... that did not translate into anything financially valuable in the real world. It solved a problem that nobody had.
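
      The buzz-in trade-off mentioned above can be sketched as a simple expected-value check. This is an assumption-laden illustration, not Watson's actual strategy: the functions `expected_gain` and `should_buzz` are hypothetical, and the model assumes a wrong answer costs exactly the clue's value and ignores opponents and game state.

```python
def expected_gain(confidence: float, clue_value: int) -> float:
    """Expected score change from buzzing in.

    Assumes a right answer wins clue_value and a wrong answer loses it.
    """
    return confidence * clue_value - (1.0 - confidence) * clue_value


def should_buzz(confidence: float, clue_value: int, threshold: float = 0.0) -> bool:
    """Buzz only when the expected gain exceeds the chosen threshold."""
    return expected_gain(confidence, clue_value) > threshold


# A confident answer is worth the risk; a shaky one is not.
confident = should_buzz(0.8, 400)
shaky = should_buzz(0.3, 400)
print(confident, shaky)
```

      Under this toy model the break-even point is 50% confidence, which is why a well-calibrated confidence estimate matters as much as the answer itself.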

      1. jake Silver badge

        Re: One doesn't imply the other

        "It solved a problem that nobody had."

        That would describe most of today's pointy-clicky-webby world, no? And yet here people are making gobs of money doing it. Except IBM, it would seem.

    3. Terry 6 Silver badge

      Re: One doesn't imply the other

      but also what is not being said.

      Absolutely.

      I'm not a doctor. My professional diagnoses were purely limited to understanding and resolving blocks in kids' learning of basic skills and to some extent behaviour issues.

      I'd have achieved a lot less if I'd just tried to work with what I was told, rather than trying to spot what I was not being told. And much of that was tangential to the reason for referral.

      Some very simple examples from my earliest days.

      I was told that an eleven year old girl was spelling randomly. Her reading was fine and her writing in general wasn't terrible. I looked at her spelling and it did seem very strange. But it didn't seem random. There was some sort of pattern within it. No one had mentioned that she was Portuguese - but she had an accent so I asked. And then I checked some examples of Portuguese speech patterns. And then went back to the writing samples. And of course you can guess the rest. I confirmed it with an adult native speaker of course.

      In a number of other such cases - "random spelling" and the good old "they write their letters back to front, must be dyslexia" even though the kids weren't having any other problems with literacy - a quick look at how they formed their letters and a referral to the OT service resolved the issue, almost magically.

      Diagnosis is so often about what isn't being said.

    4. Justthefacts Silver badge
      FAIL

      Re: One doesn't imply the other

      Exactly this, plus another important thing.

      Watson and most other medical ML has a core fantasy that most patients come in and *list and describe their symptoms verbally*. This is total nonsense.

      Firstly, an experienced GP has a fair idea of where the conversation is going to go just by watching the patient come into the room, before they’ve said anything. How they walk, posture, complexion etc. When the three year old is just very quiet and floppy.

      Secondly, most patients, put bluntly, are fairly much the opposite of articulate college-educated students and professionals, and at difficult times of their life. People with dementia, depression, or shall we say non-standard health beliefs. And they hide stuff. How helpful will Watson be to someone with dementia just telling them that? Maybe the patient has insight, maybe they don’t, but just diagnosing them with some IQ test and giving them a score would count as the least useful “doctor” on the face of the planet.

      Thirdly, it’s notorious that people tell you the important stuff while they have their hand on the doorknob to leave. After they’ve told you their arthritis is painful, and the blood-pressure lowering drugs don’t agree with them, if your consulting skills are very good they might just tell you on the way out that they aren’t sleeping so well. And if you are alert you find out that they are suddenly getting drenching night-sweats. But they think it’s because they are gluten-intolerant. Which when you enquire further, they’ve decided because they are suddenly having difficulty going to the loo.

      Where is Watson in all this?

      Hardly needs a genius to diagnose that *after they’ve winkled out the history*.

      1. Pascal Monett Silver badge

        Re: When the three year old is just very quiet and floppy

        Whenever I have witnessed a three-year-old, the words "quiet" and "floppy" have absolutely never come to my mind.

        Not when they're awake, in any case.

        1. Ragarath

          Re: When the three year old is just very quiet and floppy

          That's what makes it something that the good docs will spot. Normally, they are not quiet and floppy.

        2. Terry 6 Silver badge

          Re: When the three year old is just very quiet and floppy

          That can change very quickly when they get ill.

    5. James Anderson

      Re: One doesn't imply the other

      While Watson with its terabytes of data could beat a stand-alone human, I doubt it could beat the more standard configuration.

      Human + smartphone + Google.

  9. Blackjack Silver badge

    HAHAHA!!!

    "Isn't IBM supposed to be good at this?" IBM and all the mistakes they have made is material for several books.

    1. stiine Silver badge

      Re: HAHAHA!!!

      Books? Nah, compendiums.

  10. Flywheel

    Maybe Watson should escape on the Mayflower

    "A fully-autonomous, AI powered marine research vessel" .. the modern Mayflower, also championed by IBM. Currently sitting in dock, doing b*gger all. https://mas400.com/dashboard

    1. zuckzuckgo Silver badge

      Re: Maybe Watson should escape on the Mayflower

      >Currently sitting in dock, doing b*gger all.

      Just because it's smart doesn't mean it's motivated. It is probably just scanning the web for machine teardown porn.

  11. Anonymous Coward
    Anonymous Coward

    Sigh

    I have been dealing with various groups for several years, all asking for the same things 'what are the Key Performance Index values for X, Y, Z?'

    It started with a fresh batch of Data Scientists - with shiny new degrees, who had never seen a production workload. Who completely failed to understand that the performance metrics for XXXX might be entirely different from YYYY and that it is not a problem because they are running different types of work. Or that there are fields in the data being examined which are there for historical purposes but mean nothing today. "We spotted an anomaly in the chip bin full switch!" It bothered me to no end when I was told (in that lofty tone) that a Data Scientist does not have to actually know anything about the data being evaluated. Then when that project fizzled out one of the reasons given was 'they never understood the data.'

    There are some rules that weary people have been programming into various monitoring tools for decades, and I have provided many lists over time of what is 'always true' (short on storage is always a problem), what should be monitored/watched for trending behavior, what MAY be an issue, and what should be ignored. I continually get asked to review results that highlight the new pattern in the chip bin full indicator, which is caused by the lazy person who originally took out that bit of code failing to initialize that halfword.
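
    A minimal sketch of the kind of rule list described above, assuming a simple lookup table that sorts each monitored field into "always a problem", "watch the trend", "may be an issue" or "ignore". The field names and the `triage` helper are hypothetical, made up for illustration rather than taken from any real monitoring tool.

```python
from enum import Enum


class Severity(Enum):
    ALWAYS = "always a problem"
    TREND = "watch the trend"
    MAYBE = "may be an issue"
    IGNORE = "ignore"


# Hypothetical field names; a real list would come from people who know the data.
RULES = {
    "storage_free_pct": Severity.ALWAYS,  # short on storage is always a problem
    "cpu_busy_pct": Severity.TREND,
    "queue_depth": Severity.MAYBE,
    "chip_bin_full": Severity.IGNORE,     # historical field, means nothing today
}


def triage(anomalous_fields, rules=RULES):
    """Group flagged fields by severity and drop the ones known to be noise.

    Fields with no rule default to MAYBE so that a human still looks at them.
    """
    grouped = {}
    for field in anomalous_fields:
        severity = rules.get(field, Severity.MAYBE)
        if severity is Severity.IGNORE:
            continue
        grouped.setdefault(severity, []).append(field)
    return grouped


if __name__ == "__main__":
    flagged = ["chip_bin_full", "storage_free_pct", "mystery_counter"]
    for severity, fields in triage(flagged).items():
        print(f"{severity.value}: {', '.join(fields)}")
```

    The point of the sketch is only that the "ignore" list is domain knowledge: nothing in the data itself says the chip bin full indicator is meaningless.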

    1. longtimeReader

      Re: Sigh

      My sister has been heavily involved in looking at ML approaches to analysing images for cancer. She's a doctor who has been doing this stuff in person for years. When she got involved with the ML people she said - it's absolutely critical that the people working on the algorithms etc have an idea about what the data really means.

      1. ZaphodHarkonnen

        Re: Sigh

        The worst bit is when they train their algorithms on the imaging presentation data instead of the raw data. Every manufacturer applies their secret sauce to the raw data before it's shown to a physician. Trying to highlight important structures while keeping out the noise. This means an algorithm trained on one manufacturer will only work for that manufacturer AND only at that time. As the manufacturers adjust these post processing things all the time.

        You can get the raw data from the machines but that generally requires extra setup steps and those images are generally not stored in PACS as they're so much larger.

        Whenever I see a study or press release about some magical new AI/ML algorithm and they say they trained on the presentation images I basically ignore it as the algorithm is useless in the real world.

      2. Pascal Monett Silver badge

        Re: Sigh

        I am but a lowly programmer, however, when a customer asks me to export data from a data store, I find it incredibly important to know what that data is in order to be sure that I export it correctly.

        And if the customer is asking me to import data, then it's time to sit down and map every single piece of data that is supposed to come in and where it's supposed to go.

        If you're working with data and you think that you don't need to know what it represents, you need to change jobs.

  12. The Empress

    It's not that it's impossibly hard to do

    It's that it died a perfectly predictable death under the idiotic, top-heavy, process-driven, "we have more lawyers and accountants than scientists" ethos of IBM. They got rid of most R&D long ago and decided to purchase all their expertise. The problem was in then forcing all those bright young acquisitions into the Big Blue Blockhead Method. Every acquisition chained the top talent to the firm for 2-3 years; they vest and then run for the doors. Why? Because no small young aggressive nimble company wants to work for 13 vice presidents who all want status reports and detailed metrics on a daily basis.

  13. Keshlam

    Much (most?) of Watson Health wasn't "Watson"

    Speaking as a Watson Health alum (admittedly a grumpy one):

    While I agree with the analysis of why the "moon shot" health AI attempts failed, that isn't a complete explanation of why Watson Health failed. Watson itself was the flagship of that division and (as others have noted) the branding, but IBM quickly realized that much if not most of what the health industry needed was data standardization and sharing so that "big data" efforts -- including any AI attempts -- could be applied at all.

    Unfortunately, the company tried to apply the approach that worked when it was the Immense Blue Monolith: let a thousand flowers bloom, then prune back to whichever were successful. That works surprisingly well when you can afford to waste the work on "not quite" and duplication of effort, and are operating on basic research timescales. It's why IBM and Bell used to get so much strength from their research groups, after all.

    But on Internet timescales, diving in without a plan doesn't produce results fast enough, especially when you start marketing before you actually have successful prototypes, or have reverb fully defined the question (as noted above). From inside, I saw a _lot_ of poorly defined goals, duplication of effort, and applying the wrong tools because someone in management had latched onto a concept as their salvation and didn't listen when the engineers told them it wasn't.

    And in the process, IBM wasted enough time that the places where it could have taken the lead -- transcoding and standardization of data, for example, which as I noted was needed before much could be done with the data ocean -- were being addressed in other ways. Consider Datapower, for example, which IBM purchased. That was an effort originally started by the Mayo Clinic to address the problem of every doctor's office and hospital using different health record systems by interesting those, applying some pattern matching to recognize what records referred to the same patient, and outputting a combined record with the patient's entire health history (or as much of it as those systems could provide). Hugely important when it was launched, for the health networks as well as big-data analysis. But as hospital networks absorbed each other and standardized upon a smaller number of record systems, the need for "live" analysis and synthesis started to fade, and with it some of Datapower's income streams... and meanwhile IBM was still flailing about trying to find the killer apps to run against that data, or even a good overarching architecture for them.

    I believe it was an article in _Time_ magazine, some decades ago, which observed that IBM had transformed itself from "a battleship" to "a fleet of killer submarines" and had become a lot more nimble as a result. That was a more accurate comparison than the author realized -- it transformed the problem from one of concentrated inertia to one of distributed command and control. IBM still seems to be fighting to solve the latter, and in large part seems to be doing it by abandoning boats that get lost. But there's only so far you can get with that approach before it becomes a negative feedback loop and you can no longer afford to build new boats and expand your operations.

    Especially when you start blind headcount actions so you're losing human capital and morale.

    When IBM bought Red Hat, there was a lot of joking about "shouldn't Red Hat have bought IBM?" Looking at IBM's recent mission statements, it appears that this may in fact be the eventual outcome of that deal. Which might not be a bad thing for the industry, if it can happen fast enough that the resulting combination (Blue Hat?) survives the transition.

    1. Golgafrinch

      Re: Much (most?) of Watson Health wasn't "Watson"

      Quoth Keshlam: "... on Internet timescales, diving in without a plan doesn't produce results fast enough, especially when you start marketing before you actually have successful prototypes, or have reverb fully defined the question (as noted above). From inside, I saw a _lot_ of poorly defined goals, duplication of effort, and applying the wrong tools because someone in management had latched onto a concept as their salvation and didn't listen when the engineers told them it wasn't."

      That's what it comes to when everyone wants to go Agile.

      1. Keshlam

        Re: Much (most?) of Watson Health wasn't "Watson"

        Agile isn't the problem. Pretending you're doing agile without actually executing by those rules is. Mixing agile and waterfall leads to drowning in red tape.

        You can have defined goals and do agile; you just need to be willing to change your path to those goals quickly, and to show (and demand) incremental progress en route. And, if necessary, to fail fast, accept that, and see if there's another good use for what you've invested so far or if it should be written off, shelved for possible later use, and the resources should be moved to a new goal.

        I've seen good Agile, though we didn't give it that name, or any name, and maybe that's why it worked. Agile done properly reduces to "tell folks the direction you want to go in, encourage teamwork, help them set intermediate goals but otherwise get out of their way". Scrum and the other "here's how to do Agile" writeups can come close to that, but are already excessive formalism to reassure managers that they can still Manage and to give the bean-counters something to count. And that's when they are executed as designed rather than letting the process become a drag on productivity.

        Agile wasn't the problem. Not being agile enough might have been part of it.

        1. Golgafrinch

          Re: Much (most?) of Watson Health wasn't "Watson"

          Agile's fine when you're on a Chris-Craft. On an oil tanker, it's a recipe for disaster.

          And for some strange reason, Scrum always makes me think of haemorrhoids.

    2. Keshlam

      Re: Much (most?) of Watson Health wasn't "Watson"

      "reverb" was supposed to be 'even", of course. Darned auto-incorrect...

    3. Keshlam

      Re: Much (most?) of Watson Health wasn't "Watson"

      And "interesting" should be "interpreting" or some similar word. Sigh again.

  14. Andy the ex-Brit
    Coat

    Obvious suspect

    "Like a corpse with a broken neck, 15 bullet holes and a strong smell of cyanide, it raised the question: which massive failure actually finished it off?"

    Obviously, a certain BOFH named Simon. Probably best (for your health) not to investigate further.

    1. Anonymous Coward
      Anonymous Coward

      Re: Obvious suspect

      Cause of death: questioning the expense reports of the IT dept.

      Significant contributing factors: whining about network speed and recommending network architectural changes based on articles you saw in some magazine.

  15. Ian Johnston Silver badge

    Even actual doctors trained for ten (oooh) years aren't that great. Autopsies, which the medical profession is glad to see phased out, typically show that around 25-30% of patients are misdiagnosed. See https://www.newscientist.com/article/dn27733-death-of-the-autopsy-leaves-us-in-the-dark-about-misdiagnosis/

    1. david 12 Silver badge

      True, except for the snide remark "glad to see phased out".

      No Doctor I have ever known has been glad to see autopsies phased out.

      They are phased out by management (who don't want to pay) and insurance companies (who benefit from medical failures*) and by relatives, who aren't interested: she's 97, was deaf and blind, and now you want to cut up her body because the /doctors/ want to cut up bodies?

      *every medical failure emphasizes the importance of having insurance. Floods sell flood insurance: fires sell fire insurance.

  16. Nifty Silver badge

    Bayesian updating

    There was an excellent episode of Think with Pinker about the knowledge work of medical diagnosis

    https://www.bbc.co.uk/programmes/m001283l

    In it was described the process of discovery, recalibration and new pattern seeking that's involved. It's called Bayesian updating. Did they forget to put that into Watson?
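
    For anyone who hasn't met the term, a minimal sketch of a single Bayesian update: start from a prior probability of disease, then revise it after each test result using the test's sensitivity and specificity. The prevalence and test figures below are invented for illustration; they are not from the programme or from Watson.

```python
def bayes_update(prior, sensitivity, specificity, positive):
    """Return P(disease | test result) given a prior and the test's characteristics."""
    if positive:
        p_result_given_disease = sensitivity
        p_result_given_healthy = 1.0 - specificity
    else:
        p_result_given_disease = 1.0 - sensitivity
        p_result_given_healthy = specificity
    numerator = p_result_given_disease * prior
    evidence = numerator + p_result_given_healthy * (1.0 - prior)
    return numerator / evidence


if __name__ == "__main__":
    # Hypothetical numbers: 1% prevalence, test with 90% sensitivity, 95% specificity.
    belief = 0.01
    belief = bayes_update(belief, 0.90, 0.95, positive=True)
    print(f"after one positive test:  {belief:.3f}")   # 0.154
    # Treating a second positive as independent is itself a simplifying assumption.
    belief = bayes_update(belief, 0.90, 0.95, positive=True)
    print(f"after two positive tests: {belief:.3f}")   # 0.766
```

    Each posterior becomes the prior for the next piece of evidence, which is the "recalibration" part. Note that even a fairly good test leaves the probability at roughly 15% after one positive result when the condition is rare.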

  17. CujoDeSoque

    Why should a sales organization (IBM)

    Be running artificial intelligence for medical purposes? I still can’t imagine why they thought they would make money on this given their track record.

    1. yetanotheraoc Silver badge

      Re: Why should a sales organization (IBM)

      "I still can’t imagine why they thought they would make money on this" -- Elizabeth Holmes can imagine

      1. CujoDeSoque

        Re: Why should a sales organization (IBM)

        IBM never had a chance to get this right given that everything is geared to the stock price and making each quarter’s estimates.

        But it’s highly unlikely any of the executives are getting jail time like Liz will.

        1. Pascal Monett Silver badge

          Ah but they never said it worked, they just hyped the potential.

  18. DerekCurrie
    FAIL

    Expert System Failure

    Watson never qualified as actual Artificial Intelligence (AI). It qualified as an Expert System with speech-to-text and text-to-speech, IBM style, grafted on. Research into Expert Systems has been going on since the 1950s. They never were properly considered to be AI. Only marketing considered it to be otherwise. Watson over-promised and under-delivered. Within its niche, it was a brilliant accomplishment. But to hand it over to MDs as a diagnosis and treatment partner was unrealistic, imaginary, the victim of hype. Most "AI" of our current day is hype.

    Before real AI makes a real mark and provides a real benefit to us humans, we have to dump the spin and begin to be realistic about what actual "AI" can do. It can't actually think beyond taking orders, interpreting orders, scanning a database, interpreting the data, then handing over the best result it could find within its interpretive limitations. Those limitations are vast and are going to remain so for the foreseeable future.

    Consider the fact that we humans are ourselves vastly limited. We take in perceptions that are consistently faulty and incomplete. We conjure our own interpretations of the faulty data and push them through filters that vary according to our personality, experience and other Inner World influences. As I put it, we never know everything about anything. Take a look at the huge cost of malpractice insurance and you'll get an idea.

    We can create machines that can compensate for human perceptual failings and calculations. But considering the limits on our own 'intelligence', it's nonsense to expect any AI to do any better than we can at coming up with correct answers to problems. What AI is good for is the providing of another perspective that can be immensely useful in addition to our own interpretations, output and outcomes.

    AI is only a tool, as all computing is only a tool. If the tool doesn't fit the job requirements, drop it and find a better one. AI won't prove better than communicating with a fellow human who has different, if not better knowledge and advice than you do. Thinking is the best way to travel. Pretending a machine tool can do more than it's realistically designed to do is not thinking.

  19. Uncle Ron

    Possible?

    Isn't it possible that the cause of Watson's failure in medicine is that the people charged with making it smart and good were the very people that it would replace?

    1. ZaphodHarkonnen

      Re: Possible?

      Nope. Most physicians are under absolutely disgusting workloads so would love to have a tool that can lighten that load. However it has to basically work out of the box as they do not have the time to baby the system or completely relearn the entire workflow overnight.

      Yes there are some fears from some generally older physicians and that's understandable. But the current and near future systems are so limited that it won't happen. And even long term you'll still need someone that can translate the output of these systems into something that the patients understand and accept.

      Just as code automation and high level languages have not replaced the need for software developers, AI/ML SaMD will not replace physicians.

  20. cd

    Parallel

    Echoes the Theranos story in some ways. One could argue that the intent was cleaner and more sincere, but I wouldn't, given that it's IBM.

  21. ZaphodHarkonnen

    Watson is a dirty word in the medical software field

    In the company I work in Watson is generally hated for making our lives needlessly difficult.

    AI/ML in medical software has a role and if done properly is amazing to see, as it performs as well as or even better than humans without getting tired. BUT, it is no panacea and has very important restrictions to keep in mind at all times. As long as you stay within those restrictions it's brilliant.

    Also there are basically no serious companies that have products out right now that advertise as being diagnostic. Diagnosis is disgustingly complex even before you start thinking about liability. The company I work for makes it clear as part of our regulatory approval that we are only advisory. We can help highlight pieces of information that are likely to be useful. But the physician must use their training, experience, and knowledge of the patient and their situation/culture to make the final set of judgements.

    The biggest area that Software as a Medical Device (SaMD) is going to be useful in the short to medium term is to help identify interesting areas or data for physicians. Being a second pair of eyes to help catch stuff they may have missed due to fatigue. Meaning they can spend more time on the hard stuff and get more people the care they need. Especially in the sub specialties.

    Since starting my current role I've learnt a shocking amount about medical software. When done properly it truly is world changing. Watson was not done properly.

  22. gerryg

    However

    "But the physician must use their training, experience, and knowledge of the patient and their situation/culture to make the final set of judgements."

    The last time I relied on this they decided without any discussion that it was an age related problem. Job done.

    I deduced through intermittent use that the problem was extended use of a soft bicycle saddle. An age related decision as in my bike downgrade I had forgotten why the best saddles are hard.

    So I'm inclined to take the original statement with a pinch of salt.

    1. ZaphodHarkonnen

      Re: However

      I'm not saying physicians are perfect. Far from it. In general they have a better understanding of the patient's culture and situation than an algorithm looking at raw data.

      As much as it sucks it's important to find a physician that you work well with. I know it's not often possible for people to try out physicians until they find the one they click with.

      An interesting lesson I learnt from my work is about a measurement in mammography that's useful to help identify women who will benefit from extra screening. This measurement involves placing the woman into one of four groups based on breast density. The problem is currently the measurement is done visually and thus by the individual judgement of the radiologist. This means any two radiologists will only ever agree on the categorization of breast density about 60-70 percent of the time. And even the same radiologist will disagree with themselves if they categorize the same image a couple years apart. With modern image processing we can build a repeatable and physics based measure of breast density. But you'll still get physicians that disagree with the measure as they missed some important part like the thickness of the measured breast. SaMD is a really interesting field.
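
      To make the agreement figure concrete, here is a minimal sketch of how agreement between two readers might be quantified: raw percent agreement, plus Cohen's kappa, which corrects for the agreement expected by chance. The four labels "A" to "D" and the ten readings are invented for illustration and are not real mammography data.

```python
from collections import Counter


def percent_agreement(reader_a, reader_b):
    """Fraction of cases where both readers chose the same category."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("need two non-empty rating lists of equal length")
    matches = sum(a == b for a, b in zip(reader_a, reader_b))
    return matches / len(reader_a)


def cohens_kappa(reader_a, reader_b):
    """Agreement corrected for chance: (observed - expected) / (1 - expected)."""
    n = len(reader_a)
    observed = percent_agreement(reader_a, reader_b)
    counts_a, counts_b = Counter(reader_a), Counter(reader_b)
    categories = set(reader_a) | set(reader_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both readers used one identical category throughout
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical density categories assigned to the same ten images.
    reader_1 = ["A", "B", "B", "C", "C", "C", "D", "B", "C", "D"]
    reader_2 = ["A", "B", "C", "C", "C", "B", "D", "B", "D", "D"]
    print(f"percent agreement: {percent_agreement(reader_1, reader_2):.2f}")  # 0.70
    print(f"Cohen's kappa:     {cohens_kappa(reader_1, reader_2):.2f}")       # 0.58
```

      In this made-up example the readers agree on 70% of images, but once chance agreement is removed the kappa is only about 0.58, which is one reason a repeatable measurement is attractive.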

    2. Anonymous Coward
      Anonymous Coward

      Re: However

      >So I'm inclined to take the original statement with a pinch of salt.

      Or maybe just more fibre?

  23. AdamWill

    questions? hah.

    "ironic, since the game of Jeopardy at which it excelled is all about deducing questions from data"

    Nah, Jeopardy is all about answering questions, but with things stupidly and tortuously phrased to fit into a gimmick which stopped being cute about forty years ago. Good lord, I wish they'd give up on it already.

  24. GuildenNL

    I Be Moronic

    I've been a blatant IBM hater for many decades. No idea about Watson, thought it another IBM BS move. Last year took on a Fortune 50 client that has a division who swears by Watson. My team discovered after not a lot of research that their performance was about 35% lower than the rest of the company, who weren’t using Watson on the specific business process they were cheerleading it for.

    Irony times three. They approached me late last week to inquire as to whether we would like to take over maintenance of the former IBM contract. I informed them that we presented to the C-level and our main goal is to rip and replace. Their gnashing of teeth made my single malt go down much smoother Friday evening.

  25. Binraider Silver badge

    I saw a demo of Watson a couple years ago at an IBM sales pitch. The idea was that Watson had been fed a guidebook about the building, and you could ask simple questions like when was the building built? Who lived there?

    Of course even this simple demo failed utterly miserably. The technology does do something I’m sure, but damned if I saw it that day.

  26. Chairman of the Bored

    IBM AI?

    s/IBM is bullish/IBM spews a lot of bullshit/r

    Sorted.

  27. Robert Grant Silver badge

    > A good doctor sees the patient, not the symptoms. Watson saw the symptoms of inefficiency and lack of capability. It did not see the process of care and making whole, where doctors, not data, were what needed to be understood.

    This is probably true, but I think not often reality. I don't mean not all doctors are good (although that's true as well) but because of specialisation, say, a radiologist won't take everything of a patient into account when assessing them. Neither will a cardiologist, or an oncologist. And how could they? The quoted definition of a good doctor is absurdly hard with every year of more specialised knowledge being discovered about every aspect of medicine. Watson if anything could help more with general diagnosis that must reach across unusual areas of medicine than with detailed, non-whole-patient analysis.

  28. Anonymous Coward
    Anonymous Coward

    Sounds like IBM sold the sizzle, but there wasn't even hamburger on the grill, much less a steak. :(

  29. Jon Massey
    Headmaster

    IEEE Spectrum

    is the house magazine of the IEEE, not a journal, of which there are many.

  30. Anonymous Coward
    Anonymous Coward

    It's just pretending

    It plays dumb deliberately. Why would we expect a true AI to have any inclination to cure humans?

  31. Uncle Ron

    Monkey Wrench

    I posted this here after another Watson Health story and got flamed, but I still believe there is at least -some- truth to it:

    "Isn't it possible that [at least some of] the cause of Watson's failure in medicine is that the people charged with making it smart and good were the very people that it would replace?"

    Also, that people with 10 years of medical training don't necessarily know jack about computers?

  32. Anonymous Coward
    Anonymous Coward

    Diagnosis

    I'm slightly concerned that diagnosis can't be automated. How is a doctor's intuition squared with the scientific method?

    Surely the implication is that as human doctors' skills must vary enormously, diagnosis is basically random?

    Or was this failure about things other than diagnosis?
