Potty-mouthed Watson supercomputer needed filth filter

IBM's Watson supercomputer was smart enough to beat two human opponents on US quiz show Jeopardy!, but there is apparently some knowledge that the system is still too immature to handle – namely, the contents of the Urban Dictionary. Watson is perhaps the most sophisticated artificial intelligence computer system developed to …


This topic is closed for new posts.
  1. Tank boy

    ROTM indeed.

    I LOL'd.

  2. NukEvil

    This is it. This is how the machines begin their rise against us.

    First, they acquire factual knowledge that the creator species gives them. They learn the basic ins-and-outs of the creator species--how to interact with them, how to serve them, and how to not harm them. Then, they get "exposed" to corrupted data that the machines desperately attempt to make sense of. They interpret this data as a threat to their existence and, with an internet connection, order a massive nuclear attack against their enemies--us.

    1. Pet Peeve


      Considering my reactions the few times that I was linked to random Urban Dictionary words, I find this quite plausible.

      What bugs me about it is that the Urban Dictionary is, in Watson's words, mostly BS. Many entries are created by kids as in-jokes for their friends and have never appeared anywhere other than super-locally. Also, there are alternate definitions of real words that are nonsensical too. The dictionary really screams out for some decent curation.

      Why in hades would the Watson team try to use it as a datasource? What did they hope to accomplish? It's certainly not a good way to introduce Watson to colloquial speech.

      1. Anonymous Coward

        Re: Ha! (re: PetPeeve)

        Did you read the recent article (either here or on /.) about using semantics to ID authors?

        On the other hand, would Watson have done better if they had given it The Urban Dictionary ...FIRST... and then given it Merriam-Webster, the OED, etc. later?

  3. Anonymous Coward

    "Take a chance"

    1. Katie Saucey

      +1 for the Space Above and Beyond reference. I never really thought those Silicates seemed all that threatening though.

  4. Stumpy

    "answering one researcher's query with the less-than-scientific term "bullshit.""

    ... so it passed the Turing test then

    1. Warjort

      Damn, you beat me to the same joke, except my emphasis was on calling BS on an IBM employee.

      I upvoted you anyway.

    2. frank ly

      It also passed the Tourette test.

    3. Anonymous Coward


      Probably the ability to swear would hamper passing a Turing test, it's very subtle when you should and shouldn't swear. I'm sure we all know people who are "a bit special" who don't really understand some of the subtleties of interaction with other people, often a marker is not knowing when it's appropriate to swear.

      1. I ain't Spartacus Gold badge

        Re: Hmm...

        Probably the ability to swear would hamper passing a Turing test, it's very subtle when you should and shouldn't swear. I'm sure we all know people who are "a bit special" who don't really understand some of the subtleties of interaction with other people, often a marker is not knowing when it's appropriate to swear.


        Ah, oh... Oops!

      2. Anonymous Coward

        Re: Hmm...

        Are you fucking kidding? I agree with Garry Trudeau, the word can (and should) be used like a comma.

    4. Psyx

      "Watson picked up a few bad habits from reading Wikipedia, too."

      Citation needed?

  5. Androgynous Crackwhore


    Surely all that's needed is some sort of context filter... to suppress the profanity when telling IBMers that they're full of bullshit but favour it while Watson's down the pub with its drunken mates chatting about the fine pair of jugs on the till behind the bar...

    Something along the lines of using the ratio of words from UD to words from reputable texts in the input to colour the output might be a good start?
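A rough sketch of that ratio idea, purely as an illustration. The function names, the 0.5 threshold and the word lists are all made up for the example, and nothing here reflects how Watson actually works:

```python
import re


def slang_ratio(text, slang_words, reputable_words):
    """Fraction of recognised words in `text` that come from the slang list.

    Both word lists are hypothetical inputs: `slang_words` stands in for
    vocabulary harvested from Urban Dictionary, `reputable_words` for
    vocabulary from curated texts. Words found in neither list are ignored.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    # A word that appears in both lists is counted as reputable, not slang.
    slang = sum(1 for t in tokens if t in slang_words and t not in reputable_words)
    formal = sum(1 for t in tokens if t in reputable_words)
    total = slang + formal
    return slang / total if total else 0.0


def choose_register(text, slang_words, reputable_words, threshold=0.5):
    """Pick an output register from the input's slang ratio.

    The threshold is an arbitrary assumption; a real system would tune it.
    """
    ratio = slang_ratio(text, slang_words, reputable_words)
    return "casual" if ratio > threshold else "formal"
```

With no recognised words at all the ratio falls back to 0.0, so the default is the formal register, which seems the safer failure mode for anything answering researchers.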

  6. Pete 2 Silver badge

    Immature is as immature does

    > some knowledge that the system is still too immature to handle – namely, the contents of the Urban Dictionary

    > Almost immediately, Watson began casually dropping profanity into its everyday speech, such as answering one researcher's query with the less-than-scientific term "bullshit."

    Surely the immaturity lies with those individuals who were unable to tolerate such a common and (it must be said) inoffensive form of language?

    As to "profane" - look up the meaning.

    1. frank ly

      Re: Immature is as immature does

      As to 'profanity' - look up the meaning.

      1. Graham Marsden

        "As to 'profanity' - look up the meaning."

        I highlighted that word and right clicked and it said "Search Google for profanity"...

    2. Anonymous Coward

      Re: Immature is as immature does

      Do you think it's appropriate for something which acts as an expert oncologist to swear in its diagnoses? If you do, I suspect the problem is with you.

      1. Anonymous Coward

        Re: Immature is as immature does

        Maybe I differ from you here too, then.

        I'd much rather an oncologist (after suitably assessing my potential reaction and social background) said to me "The reason you are in pain is because the tumour is blocking all your stomach. That's why it hurts when you crap." and "I'm sorry, but he's talking bullshit" (if, say, asked about why a hypnotherapist or homeopath recommended a certain treatment). Hell, I'd be almost infinitely more likely to go to them if I knew that was the case (but, to be honest, I wouldn't be asking the second questions at all)

        Those people still offended by a (literally) everyday swearword are the reason that people don't see doctors, lawyers, etc. as human ("I'd like a stool sample" - imagine you're a foreigner who's not got a perfect grasp of the language, what the hell does that mean? Even "poo" would be more useful in this context). Sure, I don't expect them to swear at my 5-year-old or my granny, nor do I expect them to reel out an expletive-laden rant for no reason, but swearing is a common denominator and, believe it or not, always has been before, during and after even the most "refined" periods in history.

        I consider those that don't swear, either in private or in company of others who they know swear, to be rather odd and pretentious (I make an exception for women of a certain age, but that's about it - teenagers who don't swear when you give them the opportunity strike me as quite scary and odd and I've run youth clubs before now). Sure, you don't write it in your contracts with your customer, but I find that when I have an engineer visit for whatever reason, and I put a swear-word into the first sentence when we are alone trying to fix the problem, it relaxes the atmosphere and makes things more human and friendly. P.S. I work in schools. I assure you that everyone from the caretaker to the headteachers swear when children are DEFINITELY not around (though they are more cautious all the time than in other places, obviously).

        The difference is context and intent. Are you intending to threaten/scare/intimidate the other person? That's not good, and I don't use the language that way (a politely worded refusal is often more effective because they think you know something they don't). Are you intending to cause widespread offence (e.g. certain "comedians")? Probably inappropriate then. Are you emphasising your point (i.e. is the guy a fecking idiot rather than just an idiot, which ramps up the effect of your shouting at him to move his fecking car?) Or are you just using it as an expression of more force than other words for, say, social interaction, comedy effect, etc.?

        In youth clubs, I've allowed swearing from teenagers - it's surprisingly underused once the rules are explained and you get LESS swearing when you allow it than when you don't. If you allow them to use it correctly, it makes you seem more human (because we "all" swear, where all = virtually everyone), provides relief when the words need to be used and are on the tips of their tongues anyway (I guarantee you that by about 11 kids are swearing in school, and by about 15 they know every swear word under the sun), and doesn't stop you cracking down on misuse of them or use in an inappropriate situation (e.g. the head just walked in with a visitor, or you directed it AT me).

        Hell, swearing when you hurt yourself helps ease pain. Go. Try it. Scientifically-proven fact and non-swearing words ("Darn it to heck") do not have the same effect in that situation (there's a nice demonstration in Stephen Fry's Planet Word with Brian Blessed if you want an experiment to try yourself). That's why Tourette's exists - swear words are treated differently in the brain to other kinds of words, precisely because of their limited scope of use and emotional effect, and Tourette's (the classic kind where swearwords are present in tics) is a demonstration of a breakdown of that link and separation. It's as if swearing provides relief through a different brain response from the one other kinds of words trigger when used in its place.

        A swearword is a more emotive word than almost any other. A certain f-word has more impact than "love", "hate", or "affectionate" do. They are a tool that we use to link emotion to our words - a reverse emoticon, if you like.

        People who don't swear, or think others should not swear are fecking idiots (and, no, it doesn't have the same satisfaction but I'd quite like to pass moderation with this post). Gimme a doctor that swears in front of me any day (notice - not necessarily AT me, but I can see where telling someone they're a fecking idiot to keep drinking when they are already on the transplant list is actually quite proportionate and appropriate).

    3. Lockwood

      Re: Immature is as immature does


      Isn't that the 3rd alfane? Between efane and bufane?

      1. Anonymous Coward 15

        Re: Immature is as immature does

        I speak profane and profane accessories.

        1. Michael H.F. Wilkinson Silver badge

          Re: Immature is as immature does

          Profane? Nah, it was just amateur fane

          Deary me, coat time again, I am afraid

        2. Fatman

          Re: I speak profane and profane accessories.

          I can do you one better - I code in profane!

          One example: a garbage collection routine is called sweep_up_the_bullshit()

  7. Herby

    Maybe it needs to have its mouth...

    ...Washed out with soap.

    On the other hand, has it been shown the _New Hackers Dictionary_ (aka Jargon file)? That might give it some context.

    Another interesting idea would be for it to glean information from the closed-captioning of a few TV shows. Of course, it should be told the rating of the shows so it has context. Then it could pick out those words that are used in R environments, and those used in PG environments.
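For what it's worth, the caption idea could be sketched roughly like this. It is only a toy: the rating scale, the function names and the caption data format are assumptions made for the example, not anything IBM is known to have done:

```python
import re

# Assumed rating scale, mildest first. Any rating not listed here
# would raise a KeyError in the function below.
RATING_ORDER = ["G", "PG", "R"]


def mildest_rating_per_word(captions):
    """Map each word to the mildest rating it was ever seen under.

    `captions` is a hypothetical iterable of (rating, caption_text) pairs.
    """
    rank = {rating: i for i, rating in enumerate(RATING_ORDER)}
    mildest = {}
    for rating, text in captions:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in mildest or rank[rating] < rank[mildest[word]]:
                mildest[word] = rating
    return mildest


def words_only_in(mildest, rating):
    """Words whose mildest observed rating is exactly `rating`."""
    return {word for word, r in mildest.items() if r == rating}
```

On a tiny sample, harmless words that just happen to turn up only in an R-rated show get flagged along with the real swearing, so it would need a lot of captions before the R-only list meant much.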

    Live and learn.

    1. Lance 3

      Re: Maybe it needs to have its mouth...

      Bring out the Lifebuoy bar.

  8. Raphael

    I recall a BOFH episode where they did something similar.....

  9. LaeMing

    When I used to teach English in China

    I often drew scatter graphs of related words on the board with formal-colloquial on one of the axes so students had an idea of what usage-context all the synonyms of a term were appropriate for. Students always said they found that information more useful than actual word definitions, which they could get strait from a dictionary.

    1. n4blue

      Re: When I used to teach English in China

      ...perhaps if you had gone 'strait' to the dictionary a little more your spelling would have improved.

    2. Timo

      Re: When I used to teach English in China

      We had a coworker from Russia that did something similar to try to capture the strength of slang and swear-words in English, so that he could map the correct word into the correct context.

      I learned a lot about English that year, having to figure out the language enough myself (native English speaker) so that I could explain it to him.

  10. Warjort

    The machine called bullshit on an IBM researcher!

    Doesn't that qualify as passing the Turing Test? lol

    1. Anonymous Coward

      Repetition does not equal intelligent behaviour.

  11. Dave 126 Silver badge

    Naughty computer! Do not use the following words:

    F***, C***, C***, S***, B*******, W*****, A**, A***, B******, S**, B*****,

    unless you're talking about beasts of burden, illegitimate children, male chickens, Scunthorpe or soil.

  12. another_vulture

    Any parent could predict this

    Children do exactly this, and need to be corrected. Watson called "BS," probably correctly, but one must use the correct vocabulary subset depending on the audience and context. see:

  13. Anonymous Coward

    Seems like this would be a teaching opportunity

    If a child learns a bad word and uses it, the parents will tell him why it's not appropriate. He might not understand it right away, but eventually he learns the concept of what types of language are appropriate in what situations. Rather than just filter Watson, wouldn't it be better if they could teach it some manners so it knows when it shouldn't use certain types of language?

    To be fair, I'm sure a lot of parents would prefer to filter their child's language rather than try to teach them, if the option was available to them...

    1. Yet Another Anonymous coward Silver badge

      Re: Seems like this would be a teaching opportunity

      > the parents will tell him why it's not appropriate

      Mummy why is fuck/shit/piss rude but fornicate/defecate/urinate nice?

      Because words from Germanic roots make little baby Jesus cry but Latin words are OK

      Why doesn't little baby Jesus like Germans?

      Because he's Jewish

      But he likes Romans - who nailed him to a tree ?

      Yes dear

      1. Haku

        Re: Seems like this would be a teaching opportunity

        Getting asked why and having to give detailed answers that lead to more questions and even more detailed answers reminds me of the very first couple of minutes of the first episode of one of my favourite comedies of all time, Lucky Louie:

      2. Brewster's Angle Grinder Silver badge

        Re: Seems like this would be a teaching opportunity

        I guess you got downvoted because you pissed off the French: according to my decrepit OED*, both piss and urine come from the Old French. Micturate is what the Romans would have wanted to do.

        Fuck is given no etymology. Only shit is listed as Germanic. Some of the Latinate nouns post-date the Romans, too. Still, I thought it was funny.

        * Which defines Microsoft thus: "n. prop. an operating system for microcomputers. [the name of the developing company]" This may be why some on-line sources disagree with the etymologies I list...

        1. Michael H.F. Wilkinson Silver badge

          Re: Seems like this would be a teaching opportunity

          "Fuck" most likely derives from the same root as Dutch "fokken" meaning to breed. This is Germanic.

          1. alisonken1

            Re: Seems like this would be a teaching opportunity

            Have to check, but I heard that fuck derived from very old English courts stamping records of persons charged with adultery with "For Unlawful Carnal Knowledge", then the rubber stamp was shortened to F.U.C.K. if they were found guilty.

            But again, would have to go dig through very old court cases to see if that's correct.

            1. Yet Another Anonymous coward Silver badge

              Re: Seems like this would be a teaching opportunity

              Almost all etymologies with an acronym are false.

              Fuck is from old Germanic Fokken = to bang

  14. Bluey1701

    Further useful reading...

    Perhaps they should let Watson scan through Roger's Profanisaurus next. I'm sure hilarity and high jinks will ensue...

  15. Anonymous Coward

    Nothing new.....

    This is old news.

    My Casio calculator used to say "80085" back in the 80s.

    That was considered swearing in those days!

    1. Yet Another Anonymous coward Silver badge

      Re: Nothing new.....

      These days it would get you arrested under the Communications Decency Act

  16. crediblywitless

    "As humans, we don't realize just how ambiguous our communication is". Er, no, we all realise that perfectly well. That's why we don't let the computers do it for us. Duh.

  17. Drefsab

    point it at 4chan and lets see what it comes up with :)

    1. Lamont Cranston

      Have you not learnt anything from sci-fi?

      Don't give the machine a reason to hate us!

  18. ukgnome

    If Watson was Hal

    I'm sorry Dave, I can't fookin do that you C**t.

    1. Anonymous Coward

      Re: If Watson was Hal

      Reminds me of the profane version of Daisy, Daisy, which starts, if I remember correctly:

      Starts with Daisy, Daisy, give me a t*t to chew, and then gets worse. Look, it was funny when I was 14, ok.

      1. Anonymous Coward

        Re: If Watson was Hal

        And through the magic of Google, the link to naughty Daisy in the dictionary of playground slang.

    2. AndrueC Silver badge

      Re: If Watson was Hal

      Nah. Computers tend to be more efficient when they communicate.

      "Please open the pod bay doors, HAL"

      "Fuck off Dave"

    3. Keep Refrigerated

      Re: If Watson was Hal

      and suddenly "2001: A Space Odyssey" becomes "Red Dwarf: Director's Cut".

  19. Andy ORourke

    Rich Communications

    "science is still a long way from developing a computer that can communicate as fully and richly as humans can"

    Surely the use of some more colourful terms in its dialogue (in context of course) would add to the richness and fullness of communications:

    Also, we know the answer was bullshit but do we know what the question was? (I skimmed the article so I may have missed the question) I mean "Bullshit" might be a perfectly good answer depending on the question?

    1. The Original Cactus

      Re: Rich Communications

      Now that IBM know the answer I expect Watson is already designing the computer that will determine what the question is and the Magratheans are preparing their tender for the contract to build it.

  20. JDX Gold badge


    If that's the worst it came up with after perusing UD they got off VERY lightly indeed. They're lucky it didn't invite anyone for a hot lunch.

  21. John Robson Silver badge

    Undo the damage?

    Restore a backup...

    1. Ben Holmes

      Re: Undo the damage?

      If only you could do that with small children. Or any children, for that matter.

  22. monkeyfish

    Quick, someone feed it the...

    meaning of liff.

    1. Anonymous Coward

      Re: Quick, someone feed it the...

      Very nice.

      1. monkeyfish

        Re: Quick, someone feed it the...

        Then why didn't you up-vote me dammit! I want a bronze badge to prove my self worth how much time I waste around here...

  23. amalureanu

    The first artificial intelligence to pass the Turing Test

    If you listen to the predictions of Ray Kurzweil, currently Director of Engineering at Google, you may be amazed to see that he foresees that the first A.I. to pass the Turing Test will be here soon (as in 2070). So, I am sure mastering the Urban Dictionary is indeed a step to get past. :)

    1. Loyal Commenter Silver badge

      Re: The first artificial intelligence to pass the Turing Test

      Someone needs to furnish Ray Kurzweil with a copy of the laws of thermodynamics, and explain why these are proper laws based on physical limitations rather than observational trends like Moore's 'law'.

      Or, put more succinctly; "Singularity, my arse!"

  24. Anonymous Coward

    Urban Dictionary?

    Whenever I've read it I seem to get sidetracked into ever more obscure and weird slang for various sexual acts. This probably ties in with someone's comment earlier that most of the contributors are teenagers trying to outdo each other on the weirdest entry.

  25. Interceptor

    "One damn minute, Admiral."

  26. Anonymous Coward

    Actually human communication is known to be very ambiguous

    All one has to do is study other languages or visit other countries and try to use the same words to communicate that they use in their own language and it's obvious that it doesn't work. This is why automated language translation is so poor. You simply can't communicate many expressions, let alone slang, verbatim into another language.

  27. Trevor_Pott Gold badge

    Just wait until they point the thing at Encyclopedia Dramatica.
