Turns out humans are leading AI systems astray because we can't agree on labeling

Top datasets used to train AI models and benchmark how the technology has progressed over time are riddled with labeling errors, a study shows. Data is a vital resource in teaching machines how to complete specific tasks, whether that's identifying different species of plants or automatically generating captions. Most neural …
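To make the benchmarking concern concrete, here is a minimal sketch in Python. The data is invented for illustration (the class names, the ten-image test set, and the two mislabeled entries are all assumptions, not figures from the study). It shows how errors in a test set's recorded labels can penalize a model that is actually right and reward one that copies the annotators' mistakes.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the given labels."""
    matches = sum(p == l for p, l in zip(predictions, labels))
    return matches / len(labels)

# Hypothetical toy data: what is really in each of ten test images.
true_labels = ["cat", "dog", "bucket", "baseball", "dog",
               "cat", "bucket", "dog", "cat", "baseball"]

# The same test set as recorded by annotators, with two mistakes.
recorded_labels = list(true_labels)
recorded_labels[2] = "baseball"  # a bucket recorded as a baseball
recorded_labels[7] = "cat"       # a dog recorded as a cat

# A model that is right about every image.
perfect_model = list(true_labels)
# A model that happens to reproduce the annotators' mistakes.
mimic_model = list(recorded_labels)

# Benchmarks can only score against the recorded labels.
measured_perfect = accuracy(perfect_model, recorded_labels)
measured_mimic = accuracy(mimic_model, recorded_labels)
# Scoring against the truth reveals the real ranking.
real_mimic = accuracy(mimic_model, true_labels)

print(measured_perfect, measured_mimic, real_mimic)
```

On this toy set the benchmark ranks the mimicking model above the correct one, which is the opposite of the real ordering.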


  1. Paul Crawford Silver badge

    What would happen if a self-driving car is trained on a dataset with frequent label errors that mislabel a three-way intersection as a four-way intersection? The answer: it might learn to drive off the road when it encounters three-way intersections.

    Clearly that is not intelligence at all. You have faulty software because you did not have a complete grasp of the programming of it. Some might even say a negligent approach as you assumed the Mechanical Turks provided valid data, and you did not verify it yourself.

    1. Version 1.0 Silver badge

      Mistaking a three-way intersection for a four-way intersection sounds like drunk driving - perhaps we need to ban the coders writing the AI software, and those selecting the learning images, from drinking?

      1. EarthDog

        No, it has to be the companies.

        Programmers aren't professionals so there is no license to pull. Unlike engineers, doctors, nurses, etc. As such they have no liability. But the companies do.

        1. Anonymous Coward

          Re: No, it has to be the companies.

          Are you an Uber driver? Drivers are not workers so no health care insurance is needed or vacation pay, and no Uber liability when accidents happen.

      2. katrinab Silver badge
        Flame

        The people doing the training are the people doing the annoying recaptcha challenges.

        1. ortunk

          I try to sneak in two wrong answers to every challenge

  2. Pascal Monett Silver badge
    FAIL

    An enlightened explanation of how we got ourselves into this mess

    A remuneration scheme where, if you don't choose "bucket", you don't get paid.

    And people think that "AI" is actually worth something.

    1. Chris G Silver badge

      Re: An enlightened explanation of how we got ourselves into this mess

      Based on the examples given in the article, there are no labeling errors; the fault lies with the simplistic software and its inability to encompass enough parameters. A human would have little problem making sense of a 'bucket of baseballs' as encompassing both a number of baseballs contained in a bucket and a bucket containing a number of baseballs.

      The errors in these cases at least lie with the programming.

      1. Richard 12 Silver badge

        Re: An enlightened explanation of how we got ourselves into this mess

        Well, the specification.

        Expecting every image to have one, unambiguous label is just plain stupid.
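As a rough illustration of the specification problem, here is a small Python sketch using the 'bucket of baseballs' example from the comment above. The function names and the set of acceptable labels are hypothetical, chosen only for this example. It compares scoring against one mandatory label with scoring against a set of labels that all fit the image.

```python
def single_label_correct(prediction, label):
    """Score against exactly one permitted label per image."""
    return prediction == label

def multi_label_correct(prediction, acceptable_labels):
    """Score against a set of labels that all describe the image."""
    return prediction in acceptable_labels

# Hypothetical image: a bucket full of baseballs.
prediction = "baseball"
single_label = "bucket"                    # the one label the spec allows
acceptable_labels = {"bucket", "baseball"}  # both describe the image

# Under the one-label spec, a reasonable answer counts as an error.
strict_result = single_label_correct(prediction, single_label)
# With a set of acceptable labels, the same answer counts as correct.
lenient_result = multi_label_correct(prediction, acceptable_labels)

print(strict_result, lenient_result)
```

The same prediction is an "error" or a correct answer depending only on how the labeling specification was written.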

        1. sketharaman

          An enlightened explanation of how we got ourselves into this mess

          +1. Precisely.

        2. John Brown (no body) Silver badge

          Re: An enlightened explanation of how we got ourselves into this mess

          "Expecting every image to have one, unambiguous label is just plain stupid."

          I came here to say exactly the same thing!

        3. Warm Braw Silver badge

          Re: An enlightened explanation of how we got ourselves into this mess

          There was a recent article in which the 'AI' model recognised the label as qualitatively the same as the thing being labelled.

          Perhaps these researchers could do with the odd philosopher on their teams.

          1. Tom 7 Silver badge

            Re: An enlightened explanation of how we got ourselves into this mess

            Or someone a bit older and more cynical than them, someone with large complicated systems experience? I think AI is going to be fantastic, once greedy 'entrepreneurs' and technically blind managers are removed from the equation. You can't see the wood for the trees with a cash flow obscuring your vision.

            We're nowhere near the tipping point in AI becoming really useful but we may get there if the above mentioned don't fuck it up with greed and exploitation. It can do some fun things and I have no doubt that, given time to develop properly, it will be really useful for humanity, but too many greedy bastards or premature cockups may see the end of it in commerce, though you can bet someone worse will keep plugging away at it.

        4. David Glasgow

          Re: An enlightened explanation of how we got ourselves into this mess

          "Expecting every image to have one, unambiguous label is just plain stupid"?

          It depends entirely on what your wife sent you to look for in the basement.

    2. Michael Wojcik Silver badge

      Re: An enlightened explanation of how we got ourselves into this mess

      So twenty people here (so far) think machine learning is defined by a labeling procedure with perverse incentives.

      I think we can discount the possibility of human intelligence, too. Certainly it's evidence for a paucity in critical thinking.

  3. Anonymous Coward

    Google Captcha

    I was asked to click on squares with a bicycle but was shown a woman with a shopping trolley. It wouldn't let me continue until I'd said the shopping trolley was a bicycle.

    1. Yet Another Anonymous coward Silver badge

      Re: Google Captcha

      Captcha - guessing what other users outside America think a parking meter looks like, multiplied by what a bunch of Uzbeks were told to click on for $1/day

      1. TDog

        Re: Google Captcha

        And then reinforcing their model by having failures lead to you being classified as an AI:

        "Look - we are so good we caught X^X thousands of AIs yesterday alone".

        Or it might just be Chinese cultural imperialism - by reprogramming Uzbeks (and everyone else) to meet the Real Middle Kingdom Perception of Reality (tm) as promulgated by the Real Middle Kingdom (please ignore those imposters in Taiwan and understand that Tibet was always part of the Real Middle Kingdom according to the Real Middle Kingdom protocols), and thus for a small loss in unnecessary population that chose voluntarily to not meet Real Middle Kingdom standards of integrity and mental coherence mediated by their own choices the PRC (interestingly that was also a set of USA walkie talkies which were generally described as pricks) is subtly and quietly changing the perceptions of their future greatest allies. Carries on like this till the next People's Congress changes the rules (and that will be just after the Greek kalends)

        1. Anonymous Coward

          @TDog - Re: Google Captcha

          You should get some fresh air from time to time.

          1. EarthDog

            Re: @TDog - Google Captcha

            it was perfectly coherent.

            1. John Brown (no body) Silver badge

              Re: @TDog - Google Captcha

              Well, I guess all you dogs stick together. I thought it was AmanFromMars1 and had to check the poster name.

            2. jake Silver badge

              Re: @TDog - Google Captcha

              Nobody said anything about coherency.

            3. Michael Wojcik Silver badge

              Re: @TDog - Google Captcha

              "Coherent" is debatable, but it certainly wasn't well-written. Some authors (Faulkner, say) can get away with stringing one clause after another together into a breathless mega-sentence. TDog does not appear to rise to this level of prose skill.

              1. amanfromMars 1 Silver badge

                Re: Horses for Courses

                "Coherent" is debatable, but it certainly wasn't well-written. Some authors (Faulkner, say) can get away with stringing one clause after another together into a breathless mega-sentence. TDog does not appear to rise to this level of prose skill. ..... Michael Wojcik

                Would you find it strange to discover, Michael Wojcik, that whereas you shared that it certainly wasn't well-written, a few others who may indeed number many, find it to be excellently crafted with a level of prose skill it is a joy to read exists elsewhere too to highlight and share the views they witness there, here with us on El Reg.

                I ponder and wonder on whether such warrants a red flag warning and be classified as COSMIC Top Secret Pornographic Steganography given how easily it can be intelligently designed to totally deprave and sublimely corrupt. The perversion in doing that though, would be that it would make it even more attractive and universally popular making its pictures and products impossible to unilaterally command and exclusively control.

                1. jake Silver badge

                  Re: Horses for Courses

                  There is no debate about it at all! Coherent was extremely well written!

                  But then I have a thing for assembly language ...

    2. Anonymous Coward

      Re: Google Captcha

      And? By doing so, you just helped us crash SkyNet.

      On the other hand, we just need more lonely, bored, honest people...like I used to be. I used to do Captchas for fun from 9pm until 3am every day (many years ago). The more I did, the harder they became.

    3. DS999
      Mushroom

      This is all my fault

      I hate Google captcha with a passion, so when I'm asked to do it I deliberately spend a minute getting it wrong. If I REALLY want to continue on with the site that is requiring it I might then reluctantly decide to get it right, but usually I'll just abandon it after that.

      If enough people would deliberately pollute Google's results like that they would drop it, and stop trying to force people into teaching their AI for free. If their AI ends up not being able to tell a car from a bicycle my job is done!

      1. Cederic Silver badge

        Re: This is all my fault

        With you on that.

        Maybe they're not testing image recognition but instead training AI to identify when people are intentionally trying to trick AI. Seems about the only actual use they'll get from it.

      2. Tom 7 Silver badge

        Re: This is all my fault

        I've found quite a few that I would have failed if I hadn't spent a couple of years in the US, because they use US English that most of the UK over the age of 30 would probably not have a clue about.

    4. Michael Wojcik Silver badge

      Re: Google Captcha

      Oh, ReCaptcha is complete crap. It exists only because Google's found a way to get lazy people who run sites like StackOverflow to get them free labor. It's one of the reasons I quit contributing to SO. (Another is the fact that the site is no longer usable unless you enable scripting, thanks to their use of crap from cookielaw.)

      1. Anonymous Coward

        Re: Google Captcha

        Also the irony that the crap from cookielaw "solution" to a privacy problem actually results in StackExchange telling yet another third-party site exactly which websites you are visiting. Duh!

        I have half a mind to whinge about it on the meta site to see if they'll do anything about it...

    5. Stuart Castle Silver badge

      Re: Google Captcha

      When responding to these captcha things, I regularly get asked to identify all the “Crosswalks”. Trouble is, being English, I don’t call them “Crosswalks”. I call them “Crossings”, sometimes adding the word “Zebra” before.

      This applies to a lot of sites, even UK ones.

      1. Graham Dawson Silver badge

        Re: Google Captcha

        And then it forces you to identify on-road warnings (like SLOW) as a cross-walk and fails if you don't comply. It's a weirdly demoralising thing to experience.

      2. wegie

        Re: Google Captcha

        @Stuart Castle "...Trouble is, being English, I don’t call them “Crosswalks”"

        Not just "crosswalks". Now identify all the "fire hydrants"...um, well, there aren't any. There's a load of streetside hose connections of some kind, but no rectangular iron plates in the road or pavement labelled "FH". The complete cultural imperialism and cross-culture blindness is breathtaking. Just imagine a car trained on that on UK roads.

        1. Ken Moorhouse Silver badge

          Re: but no rectangular iron plates in the road or pavement labelled "FH"

          I don't think it registers with us Brits as much as in the States. Yes, we do have them and I remember having it explained to me as a kid what the two numbers on the yellow H sign stand for, but they are just not part of our everyday awareness, as they presumably are Stateside.

          https://commons.wikimedia.org/wiki/Category:Fire_hydrants_in_the_United_Kingdom

          1. Terry 6 Silver badge

            Re: but no rectangular iron plates in the road or pavement labelled "FH"

            Aren't there laws (state or federal) about parking in front of those big kerbside hydrants? They do seem to feature in films ("movies" as they call them there) quite a lot. People crashing into them, making fountains with them and so on.

            In UK and other places they're just tucked away below ground.

        2. Blofeld's Cat Silver badge
          Devil

          Re: Google Captcha

          "... Just imagine a car trained on that on UK roads ..."

          "On the left? What do you mean 'they drive on the left'?"

        3. Tom 7 Silver badge

          Re: Google Captcha

          When you get one that asks you to click all buses and there's not a circuit board or chassis shown!

    6. xyz

      Re: Google Captcha

      Makes me want to say, stop the world, I want to get off

  4. jake Silver badge

    And of course people are bloody minded ...

    ... I know people who intentionally mislabel pictures that might be slurped, just to bugger up the data. Some because they are horrified by the concept of AI, some because they hate big business, some just for the lulz, and some for various other reasons, or combinations of reasons.

    Computers sometimes make mistakes. To really bugger it up requires a human.

    1. handle handle

      Re: And of course people are bloody minded ...

      Aye, but computers can make them a thousand times faster.

      1. EarthDog

        Re: And of course people are bloody minded ...

        And pollute a huge number of data streams

        1. jake Silver badge

          Re: And of course people are bloody minded ...

          Yes, computers are a force multiplier.

          (For clarity, please note that that word is not "multiplayer".)

          1. John Brown (no body) Silver badge

            Re: And of course people are bloody minded ...

            "Yes, computers are a force multiplier."

            Give me a lever long enough and a fulcrum on which to place it, and I shall move the world. - Archimedes

            I don't think anyone could disagree that computers have "moved the world" :-)

          2. Coen Dijkgraaf

            Re: And of course people are bloody minded ...

            > Yes, computers are a force multiplier.

            “To err is human but to really foul things up requires a computer.” —Paul Ehrlich , American scientist (1926- ).

            1. jake Silver badge
              Pint

              Re: And of course people are bloody minded ...

              When you have to explain the joke ... Ah, fageddaboudit.

              Have a beer :-)

      2. jake Silver badge

        Re: And of course people are bloody minded ...

        "Aye, but computers can make them a thousand times faster."

        But only if told to do so by a human. Computers can't think for themselves. They are not intelligent (Sorry, amfM ... but deep down even you know that this is true.)

        1. Tom 7 Silver badge

          Re: And of course people are bloody minded ...

          Most humans aren't! I think computers can be intelligent; it's only humans stopping them at the moment, much as they stop other humans.

    2. DS999

      Re: And of course people are bloody minded ...

      As I just confessed above before seeing this post lol!

      1. jake Silver badge
        Pint

        Re: And of course people are bloody minded ...

        :-)

        It's Friday. Have a beer.


Biting the hand that feeds IT © 1998–2021