Turns out humans are leading AI systems astray because we can't agree on labeling

Top datasets used to train AI models and benchmark how the technology has progressed over time are riddled with labeling errors, a study shows. Data is a vital resource in teaching machines how to complete specific tasks, whether that's identifying different species of plants or automatically generating captions. Most neural …


    1. doublelayer Silver badge

      I think that approach is too limited.

      "So why isn't there an AI program somewhere that can actually label the photograph itself? 'Cos AI isn't AI really, it's what the programmers tell it to do."

      Whatever our definition of intelligence ends up being, it's got to do what the programmers tell it to do at a low level, otherwise it's no longer artificial. We could probably argue about what intelligence is all day, but if AI is possible at all, it has to be implemented with code and machinery, which means its instructions are created by another intelligence.

      "As stated above, GIGO and it always will be until a 'machine' is actually cognitive."

      Now this doesn't sound fair. GIGO applies to anything. If you take a human child and prime them with a bunch of false data, without giving them the ability to learn that you're lying, they'll believe you. If everybody the child meets insists that the tall wooden things that grow outside are called squirrels, they will believe those are squirrels until they meet some other people who correct the misconception. If you're going to define intelligence as "the ability to figure out that everything people tell you is wrong even though you have no other source of information", that's a high bar. Computers don't get to meet random people and learn from them, or even experiment with actions to see the consequences. In life, we have far more input than any of these programs ever get, and yet there are plenty of humans who form incorrect concepts of how the world is.

      1. Ken Moorhouse Silver badge

        Re: it's got to do what the programmers tell it to do at a low level

        I did tinker with "learning" games some time ago; Connect4 was one such attempt. It is very easy to program the rules of the game in, and easy to get the program to "learn" by making moves which result in a win or a loss. But there is no realisation that a mirror image of a move would have the same win-lose effect... unless it is programmed in.

        Then there are the counters on the periphery of the action: their presence or absence in every possible combination, and in every possible chronology, takes up a huge space in the learning dictionary, and in the time taken to crunch games involving those moves, unless the programmer uses his/her intelligence to collapse those combinations down. The only learning involved is the programmer's, in coming across those combinations and adding code to deal with them.

        Learning, to me, would involve the program recognising for itself that one part of the game is a mirror of another, and inserting the code to handle that. I suspect that would involve grids full of esoteric data comprehensively weighting every single play in every possible chronology. Such a program would make sense to nobody, because it would be the epitome of abstraction: completely unmaintainable beyond the individual or team writing it.
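        The mirror-image collapse described above can be sketched in a few lines. This is a hypothetical illustration, not code from any real engine: the board representation (a tuple of seven column-tuples) and the helper names `mirror` and `canonical` are my own assumptions.

```python
# Sketch of collapsing mirror-image Connect4 positions into one
# learning-dictionary entry. Board: tuple of 7 column-tuples,
# each column listing counters from bottom to top.

def mirror(board):
    """Reflect the board left-to-right."""
    return tuple(reversed(board))

def canonical(board):
    """Pick one representative for a position and its mirror image,
    so both share a single dictionary key."""
    return min(board, mirror(board))

# Two mirror-image positions collapse to the same key:
b1 = (('X',), (), (), ('O',), (), (), ())
b2 = ((), (), (), ('O',), (), (), ('X',))
assert canonical(b1) == canonical(b2)
```

        Keying the learning dictionary on `canonical(board)` roughly halves the state space without the programmer having to enumerate the mirrored cases by hand.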

        So what am I saying? I'm saying that the law of diminishing returns applies to AI, and that the best compromise is for the programmers maintaining the code of the AI program to be considered the AI behind the program. This is nothing new; it is the way programs have always been written.

  1. Michael H.F. Wilkinson

    I don't like saying i told you so, but ...

    I frequently rail against claims made in AI papers, pointing out that if the ground truth contains a percentage of errors, any AI system trained on it is likely to end up with a similar actual error rate. I have seen people claim an increase in performance from 97.6% to 98.1% (error bars not included) on data sets where there are two ground truths, drawn up by two medics, which are at odds with each other. In our own earlier work, we managed to get a sort of Pareto optimum of 92.5 ± 0.6% on both ground truths, but were in places penalised for finding blood vessels the doctors had missed. Somehow, ground truth 1 has been elevated to The Ground Truth, and the other demoted to "a human observer". And now AIs are "better" than the poor human observer simply because they have been taught to copy all the mistakes the other human made.
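    A back-of-envelope check makes the error-bars point concrete. Assuming a test set of 1,000 images (my assumption; the papers in question rarely say), a normal-approximation 95% interval on each reported accuracy shows the two figures are indistinguishable:

```python
# 95% confidence intervals (normal approximation) for a reported
# accuracy `acc` measured on `n` test samples.
import math

def ci95(acc, n):
    half = 1.96 * math.sqrt(acc * (1 - acc) / n)
    return acc - half, acc + half

lo1, hi1 = ci95(0.976, 1000)   # the "97.6%" result
lo2, hi2 = ci95(0.981, 1000)   # the "98.1%" result
print(f"97.6%: [{lo1:.3f}, {hi1:.3f}]")
print(f"98.1%: [{lo2:.3f}, {hi2:.3f}]")

# The intervals overlap, so the half-point "improvement" cannot be
# distinguished from sampling noise at this test-set size.
assert lo2 < hi1
```

    With a test set this size the interval half-width is close to a full percentage point, swallowing the claimed 0.5-point gain entirely.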

    If ImageNet contains up to 6% label errors, I will continue to take all claims of 99% or better performance with a considerable pinch of salt. Furthermore, if error bars are not included, how can they claim to be better than an earlier method when the differences are sub-1%?
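    The 6% figure implies a ceiling on measurable accuracy. A quick simulation illustrates this, under the simplifying assumption that label errors are random (real ImageNet errors are not, and the class count and error model here are mine):

```python
# Simulate scoring a *perfect* classifier against a test set whose
# labels are wrong 6% of the time, assuming random errors over 1000
# classes. Even perfection cannot measure above ~94%.
import random

random.seed(0)
n = 100_000
truth = [random.randrange(1000) for _ in range(n)]           # true classes
labels = [t if random.random() > 0.06 else (t + 1) % 1000    # 6% mislabelled
          for t in truth]

# The hypothetical perfect classifier predicts the true class every
# time, but is scored against the noisy labels:
measured = sum(p == l for p, l in zip(truth, labels)) / n
print(f"measured accuracy of a perfect model: {measured:.1%}")
assert measured < 0.95
```

    Under this model, any benchmark number above roughly 94% is measuring agreement with the labelling mistakes rather than with reality, which is exactly why 99% claims deserve the pinch of salt.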

    I am not saying deep learning and CNNs are useless; it is just that sloppy science does them a disservice.

  2. theblackhand


    We need another standard....

    And an AI to create the standard of course.

  3. Sandstone

    Evil Human

    When some friends of ours had their first child and he was learning to talk, I would point to random objects and say, "This is an aardvark."

    1. Eclectic Man Silver badge

      Re: Evil Human

      In the Dorling Kindersley 'ABC' book for toddlers, each letter of the (Roman) alphabet has several pictures. A is for Apple, Ant and Avocado, but I didn't spot an aardvark.

      However, I applaud your teaching the little one that adults are inveterate liars and can be naughty too. That lesson should serve him well in the coming post-apocalypse dystopia, however it is caused.

  4. Anonymous Coward
    Anonymous Coward

    compounding errors

    It always irritates me that when the "I am not a robot" captcha asks to highlight all motorcycles, the user has to also highlight scooters or else the result will not be accepted.

    1. DeathSquid

      One weird old trick to deal with captchas...

      Look near the top of your browser window. There's a back button. Press it, and the captcha disappears.

    2. Henry Wertz 1 Gold badge

      Re: compounding errors

      I do what it asks -- I will not mark scooters as motorcycles, since they aren't; I will not mark SUVs as cars, since an SUV is not a car. Go ahead and claim I'm wrong; I know I'm right.

      The other one it seems to have problems with is marking the traffic lights, where I'll mark the traffic lights and it'll claim I'm wrong. (I assume some people are probably marking the whole pole and everything, not just the lights? I don't even know.)

      For some reason, some people don't seem to know what a crosswalk is either, given that I can mark the crosswalks exactly and still have it fail.

      1. Ken Moorhouse Silver badge

        Re: For some reason, some people don't seem to know what a crosswalk is

        I think it has already been mentioned, but there is no such thing in the UK, so YMMV with how us Brits will answer.

        "The other one that it seems to have problems with are marking the traffic lights, "

        When I read the question, where it says click on those that *contain* a traffic light, I'm afraid I will click on all the images containing the pole, because to my mind the traffic light cannot exist without the supporting pole. Google's devious reaction would probably be to show a traffic light fixed to brickwork with a white line painted down to the ground (looking like a pole; the resolution of the images is so awful it is guesswork at the best of times), just to be awkward.

        Thinking about it, Google could easily profile people by their answers. Show pictures of Armstrong on the Moon, for example, and ask users to tick the images showing the Moon. Conspiracy theorists could be segregated by their refusal to select those images.

  5. a_yank_lurker Silver badge


    Human languages are not exactly the most precise in many ways. Anyone who has ever done any translation knows the imprecision in the source makes an accurate translation difficult at times. Also, context is very important to how something should be interpreted. This nuance makes Artificial Idiocy, not Artificial Intelligence, the likely outcome of these systems.

    I would like to know how these idiot systems would handle the phrase 'While you've a lucifer to light your fag' from the WWI song 'Pack Up Your Troubles in Your Old Kit-Bag'.

  6. Anonymous Coward
    Anonymous Coward

    A very small casserole.

    A bucket of baseballs should be labelled as a bucket of baseballs.

    God, the Renaissance was just something that happened to other people for you, wasn't it, Baldrick.

  7. Henry Wertz 1 Gold badge

    Reason for errors

    So, I think there are two main reasons for these errors:

    1) People (both in the article and the forums) have discussed the structural problem: if you have to give each image a single tag, a bucket of balls cannot be tagged accurately.

    2) The other issue: who tagged these things? I bet when these were tagged, you either had someone getting paid minimum wage to go through thousands of images; some Amazon Mechanical Turk type arrangement where they're getting something like 1 cent an image (which might make it even less accurate, since they'd then prefer to tag as fast as possible and probably still not make minimum wage); or student interns (paid or not) being asked to tag piles of images. I don't have a suggestion for a better way of doing it, but a) I'm guessing most people would do this as quickly as possible rather than as accurately as possible, and b) even if the person doing it was going for accuracy, after a thousand images, how many people will still be paying full attention to what they're doing?

  8. Fruit and Nutcase Silver badge
  9. amanfromMars 1 Silver badge

    Thank your lucky stars such is so far as light years from the truth

    Turns out humans are leading AI systems astray because we can't agree on labeling

    Oh please, failing to follow leading AI systems is the all too apparent default astray human condition/endemic systemic weakness and exploitable vulnerability resulting in a vast catalogue of problems and conflicts presenting in mayhem and madness.

    However/Nevertheless ......

    At some point soon, through virtual intervention and systems collapses aka program and project hacks, will civilisation be transformed, and that is the stock it would be rank foolish not to be investing in. Such is an inevitability and foregone conclusion held by many but still only practically known to a relative few.

    Such is the current present state of virtually adept Great Games Play. I Kid U Not.

    Forewarned usually affords one the opportunity and pleasure to be forearmed. Good luck with finding the suitable weapons to wield and to yield in defence and attack against such as all of that.

    ........ is certainly infinitely better than the punitive alternatives headlined for consideration and possible activation they be in reply to ...... Hedge Fund CIO: "At Some Point, Through Inflation, War Or Confiscation, The System Will Restart"

    Why is it that so many tend to congregate to try repeating past failed methodologies expecting them to produce something different and new? It is surely illogical and may even be an indicator of a weakness supporting one's flights of crazy fancy into a personal hell and private madness?

  10. wjake

    Terrible example

    "What would happen if a self-driving car is trained on a dataset with frequent label errors that mislabel a three-way intersection as a four-way intersection?"

    What a crap example.

    The self-driving car should have accurate map information, so it won't have to rely on recognizing what type of intersection is coming up.

  11. JamesTGrant


    Was hoping to read about AI Systems ashtray.

    Am disappoint.

    1. Ken Moorhouse Silver badge

      Re: AI Systems ashtray

      No need to be disappointed. It would probably be full of Stubs anyway.

