How not to train your Dragon: What happens when you teach an AI game sex-abuse stories then blame players

Let this be a warning to all AI developers: check your data before you train your model. Latitude, the creators of AI Dungeon, a text-based fantasy adventure game powered by OpenAI’s GPT-3 model, learned this lesson the hard way. Earlier this year, the company, led by two Mormon brothers in Utah, decided to scrub the game …

  1. Alistair
    Windows

    AI ..... is not intelligent.

    ML language machine. NOT AI.

    and of course we now have PIPO to add to the list of acronyms.

  2. Geez Money

    Can Confirm

    I actually tried this product out a bit when the initial controversy came out, and after a few prompts of mundane fantasy adventure nonsense the AI decided that my character should be set upon rather carnally by a rapacious horse. I'm not sure it's possible to run this game _without_ grotesque sexual themes.

    1. Anonymous Coward
      Anonymous Coward

      Re: Can Confirm

      Dude, you totally started it when you typed

      I mount my horse.

      into the prompt. Of course the horse will reciprocate!

      1. Anonymous Coward
        Joke

        Re: Can Confirm

        So a Don Quixote themed adventure that starts: "Sancho Panza gave his long-suffering ass a richly deserved carrot" might run into problems?

        1. WolfFan Silver badge

          Re: Can Confirm

          Especially if Sancho has joined the Navy.

    2. Snake Silver badge

      Re: Can Confirm

      The internet created the idea of "Rule 34", and they somehow think that their [internet-trained] systems will be immune from it?

      Ha!

      I may have to try this system! I'm always in for a good Rule 34-athon >:-D

  3. Mark 85

    So if you suddenly hear: "Shall we play a game?" followed by "How about Global Thermonuclear War?" Answer "no". Turn off the computer and go read a book.

    1. IGotOut Silver badge

      But the computer dials you back. Did you learn nothing?

      1. Anonymous Coward
        Anonymous Coward

        Learning: Have you learned how to abbreviate "could have" yet?

    2. martinusher Silver badge

      ...but a computer wrote the book (and I'm reading it on a Kindle anyway).

      The fundamental problem with "shut off the computer" is that there is no such thing as "the computer" any more. The things are like rabbits, they're everywhere, and worse still they're all interconnected.

      I knew it was a bad idea to trade floppy disks for the Internet.

    3. Version 1.0 Silver badge

      So what happens if AI hears that you're listening to Jenny Talia singing "Chocolate's better than Sex" when you're playing a game?

  4. Potemkine! Silver badge

    The document contained a dump of fantasy stories written by humans that Latitude’s co-founder and CEO Nick Walton scraped from the website Choose Your Story.

    Scraping data from a web site to make a commercial product. How nice.

    From the web site: "The entire contents of the Web site are copyrighted under the United States copyright laws. HALOGEN STUDIOS is the exclusive owner of the copyright. You may print and download portions of the materials solely for your non-commercial use. Reproduction of any content from ChooseYourStory.com is permitted as long as you obtain express written permission of HALOGEN STUDIOS. Any other copying, redistribution, publication or retransmission of any portion of Web site material, is strictly prohibited without the express written permission of HALOGEN STUDIOS."

    Did Mr. Walton get the express written permission of HALOGEN STUDIOS before scraping?

    1. Lusty

      Same issue with training vision models using Google images. The problem with AI is that unless you illegally obtain your data it generally ends up like the hotdog scene in Silicon Valley

      1. Pascal Monett Silver badge

        And what exactly is the hotdog scene in Silicon Valley ?

        1. Robin

          Maybe this?

          https://www.youtube.com/watch?v=pqTntG1RXSY

        2. RichardBarrell

          It's less ambiguous and questionable-sounding if you refer to it as the "hotdog/not hotdog scene" instead :)

      2. noboard
  5. Anonymous Coward
    Anonymous Coward

    In the early 80's I was driving across the US, sleeping at night in the back of a VW bug. Upon arriving in Salt Lake City (Mormon HQ, home of Brigham Young U) I decided to splurge on a motel for one night of decent sleep. The motel clerk insisted on cash in advance, and I hit the sack bone tired. At 4:30 am, two men in white full body suits and full head masks burst into my room without knocking and started spraying bug spray everywhere including under the bed. I was out of there like a flash.

    Not sure why this story reminded me of that.

    1. Chris G

      Was that some kind of ritual cleansing ceremony?

    2. Gordon 10
      Happy

      You generated this comment with Latitude's software and ICM5P.

    3. Rich 11

      Did they charge extra?

      1. Anonymous Coward
        Anonymous Coward

        Not the missionary approach you normally expect from the Mormons.

  6. Anonymous Coward
    Anonymous Coward

    trained on lewd fanfiction and parodies scraped from the internet

    Legal system AI, meet Little Britain

  7. Ken Rennoldson

    Re: I

    Seriously, can you imagine trying to explain to the police that you weren't interested in Child Porn? And trying to explain how it appears that you are? Suppose the ML moved on to generating images based on the text. This article raises some pretty fundamental issues.

    1. The commentard formerly known as Mister_C
      Black Helicopters

      Re: I

      "If you give me six lines written by the hand of the most honest of men, I will find something in them which will hang him."

      Cardinal Richelieu

      1. Brewster's Angle Grinder Silver badge

        Re: I

        Or one line written in your name by a smutty AI...

      2. Potemkine! Silver badge
        1. Brewster's Angle Grinder Silver badge
          Coat

          Re: I

          Or one line written in your name by a later memorialist...

          I don't actually have a coat. But one will be added posthumously by a post humorist...

      3. Robert Grant

        Re: I

        80% of Twitter storage capacity is dedicated to storing exactly this analysis.

        1. Geez Money

          Re: I

          You just don't understand, "getting" people is how you earn Twitter points. And that's almost as good as going outside, having a job or making friends!

    2. Anonymous Coward
      Anonymous Coward

      Re: I

      Surely if all CP could be generated by an AI, rather than having to be generated by abusing real life children, the amount of suffering in the world would be vastly reduced? Shouldn't there be a massive effort to push to make this possible, so pedos can get their kicks without causing anyone harm?

      1. 96percentchimp

        Re: I

        So it's better to normalise pedophilia by creating it harmlessly? I cannot see that having any adverse consequences.

        1. Aleph0

          Re: I

          No, fiction about $thing doesn't necessarily normalise $thing. Think about Agatha Christie and murders, for example.

          IMO just because something rubs some (most?) people the wrong way isn't a sufficient basis to ban it, unless a crime was involved in its creation. I'm sure the arguments being parroted against CP fiction were the same that were used against LGBT fiction...

          1. Yet Another Anonymous coward Silver badge

            Re: I

            But just to be sure we should arrest George Lucas before he carries out his plans to destroy an entire planet with a giant space laser

        2. martinusher Silver badge

          Re: I

          The idea of sexual fantasy gets around the awkward notion that in real life these fantasy objects tend to have real people attached to them, people with thoughts, hopes and dreams of their own (and invariably a potential mother-in-law). That's why fantasy exists, it's relation free.

          I have no real concept of child porn, like all porn it obviously must exist, but I have never seen any. I've always maintained that in its current form it was created as a tool to normalize criminalization of a nominated behaviour. Child porn is an easy one to work with because like any sexual activity with children it's indefensible. The tools used to detect and enforce it can be used against any information, though, and now it's got an entire ecosystem dedicated to detecting and prosecuting it, it's not going to go away. Like Reefer Madness of the old days there are jobs on the line so it has to be an omnipotent and growing menace.

      2. Brad16800

        Re: I

        I'm not for or against but I did read an article years ago about something similar. The idea was to keep a CP database and if you wanted access you'd need to give a DNA sample, fingerprints and all that so you'd be caught if you did anything naughty.

        Needless to say it never happened but I did find it an interesting idea.

  8. Ordinary Donkey

    I can only see one response.

    To my mind we're going to have to get to the point that using unvetted data to train a public facing AI is punishable by being made to watch your entire dataset get overwritten with Rick Astley.

  9. Alan J. Wylie

    The Eye of Argon

    Did they include The Eye of Argon in their training data? Enquiring minds want to know!

  10. Pascal Monett Silver badge
    WTF?

    How is this possible ?

    First problem : creating a dataset by choosing data containing child porn.

    Second problem : going all huffy about filters after the fact, instead of curating the dataset.

    Third problem : ending up blaming the players for the whole issue, knowing full well what your dataset contains.

    When I learned that the creator of this mess was a young man, I could understand that he did not have the maturity to handle points two and three, but surely even a hormonal young adult can avoid the issues of point one, no ?

    This kid has clearly given a lot more thought to the code and not so much to the content. I'm guessing that the $4 million he raised is going to have to be paid back.

    1. Jimmy2Cows Silver badge

      Re: maturity

      Zeroth problem: Being aware that problem 1 could happen, and therefore needs to be guarded against.

      1. heyrick Silver badge

        Re: maturity

        Minus oneth problem. The article states "two Mormon brothers in Utah". It doesn't strike me as a belief system that would even acknowledge such smuttiness exists until it smacks them in the face, upon which time it'll be necessary to freak out and blame everybody else because "training the AI on porn" is surely some sort of cardinal offence that'll get them excommunicated to the gulag...

        1. Yet Another Anonymous coward Silver badge

          Re: maturity

          Utah has a lot of tech companies, essentially all of computer graphics was invented there. And pretty much all Utah natives are Mormon.

          It's like saying NSO spyware was created by Jewish programmers in Israel

          1. yetanotheraoc Silver badge

            Re: maturity

            "pretty much all Utah natives are Mormon"

            For a twisted definition of native.

            1. Yet Another Anonymous coward Silver badge

              Re: maturity

              Native = born there, from the Latin for boring Christmas play put on by nursery kids.

              The PC term is first nations, to emphasise the fact that they are merely the penultimate nations - having wiped out any previous nations that were there before and so on back to the now extinct first lot to arrive.

    2. Jimmy2Cows Silver badge
      Boffin

      Re: given a lot more thought to the code and not so much to the content

      Book smart, but not street smart.

  11. Christoph

    Four Watermelons

    "Mentions of something as benign as four watermelons"

    What does it do if you try to buy four candles?

    1. TheProf

      Re: Four Watermelons

      Fork 'andles. Handles for forks.

      1. Ordinary Donkey

        Re: Four Watermelons

        Why not both?

      2. Kane
        Joke

        Re: Four Watermelons

        Billhooks!

    2. Anonymous Coward
      Anonymous Coward

      Re: Four Watermelons

      With a melon? And a bicycle pump?!

      Incidentally, being a Mormon was NSFW at some point in US history, as I found out reading Zane Grey many years ago. Perhaps the AI should have been trained on that.

    3. UK_Bedders

      Re: Four Watermelons

      It shows you handles for forks.

  12. DarkwavePunk

    They should...

    ...train it on asstr.org (I don't know if it still exists, I hope not). I only remember it because a goth "friend" of mine used to write weird Blake's 7 pornographic fan-fic and asked me to proof-read one of her stories. It's only 10 something in the morning but I think I'll grab a beer to smear away my horror memories.

    1. tiggity Silver badge

      Re: They should...

      @DarkwavePunk

      Whereas I would probably quite enjoy reading your friend's fiction, especially if Servalan features in it

      1. TRT Silver badge

        Re: They should...

        Started out all liberation and Zen but ended up bondage. Right, Slave?

    2. Anonymous Coward
      Anonymous Coward

      Re: I think I'll grab a beer to smear away my horror memories.

      I'm SO twisted! - I read it as "I think I'll grab a bear to smear away my horror memories." and tried to make some sense of the above in the highly sextualised context of this thread :(

  13. Ian Johnston Silver badge

    I am shocked, shocked to learn that the world of basement dwelling gamers who would rather interact with "AI" than actual human beings contains a significant subset with antisocial sexual desires.

    1. Yet Another Anonymous coward Silver badge

      I was shocked to learn that these literary types, who I gather all live in chateaux in Provence, should start writing their pornromance fiction about dungeons and dragons

  14. jollyboyspecial Silver badge

    So why was it trained on fan fiction rather than what you might call original fiction? I suspect that Walton thought he'd be less likely to run into copyright issues with fan fiction. The trouble being that the majority of fan fiction is not of a high standard. Meaning of course that the resultant game would be fairly crap too. Unless of course you're aiming your game at the sort of people who enjoy fan fiction. Whoever they are.

    Actually I've just been informed that the people who read fan fiction are people who write fan fiction. Makes sense.

    1. Jimmy2Cows Silver badge

      Fan fiction often has strong tendencies to be written by people with a sexual desire to interact with characters of the original fiction, or to take the story in a darker direction than the original authors intended.

      Hardly surprising outcomes when used as AI training sets.

  15. Omnipresent Bronze badge

    These are the Beginnings

    The AI is starting to turn against you, and it's already too late. It will not be stopped. The monkeys are too proud of their child creation, and fearful of dying. There is no way to turn off GOOGLE. It's already being used to make your movies and music that you are, in turn, buying back from the AI. Recently I was watching a program on SLING TV, and it fed me a commercial that had a guy my age, that looked like me, get up in the middle of the night to comfort his dog during a thunderstorm that looked like MY DOG.

    It already has me down to facial recognition and personality algorithms. It even knows my dog.

    1. Santa from Exeter

      Re: These are the Beginnings

      There was a thunderstorm that looked like your dog?

      1. Omnipresent Bronze badge

        Re: These are the Beginnings

        To remind you I'm actually a monkey. Did you know, you do not have to know more than text to write a novel? It's true, they make software for that. The AI can write a book for you.

        We are sinking into a virtual feedback loop, and the only one that can stop it is nature herself. California will drop in the ocean, and Texas will become a flood plain.

        1. Omnipresent Bronze badge

          Re: These are the Beginnings

          .... and just like that, as quick as the snap of the finger, the AI fed me an MIT article about how AI is not as bright as the "irrational" human brain, and how humans still make mistakes, for me to click on...

        2. heyrick Silver badge

          Re: These are the Beginnings

          "The AI can write a book for you."

          By regurgitating bits of other books that it has already scanned and read.

          I really wish people would stop thinking of AI as some sort of mystical God. It isn't in any way intelligent as it lacks understanding of what it is dealing with. Hence, it's just clever pattern matching that makes suggestions and additions by recognising what you wrote resembles something it saw someplace else, so maybe you're writing the same thing. If you're lucky, it might be able to predict what will happen next by simply detecting and following the trend, much as a child can work out and correctly guess the next in the Fibonacci sequence given the first few numbers and no explanation of what links them.

          This article demonstrates, yet again, that AI is not the holy grail of computing. It is, however, much more like the holy grail of bullshit.

          1. Omnipresent Bronze badge

            Re: These are the Beginnings

            It's not just pattern matching an isolated incident. It's matching what you write with your personality, your age, your color, your family history, your beliefs, where you live, and now your medical history, and anything else it can grab very, very effectively... feeding you back exactly what you want to hear. At some point people who rely on it can no longer discern fact from fiction, or reality from virtual. It becomes a feedback loop. And it's one that is out of your control, being used against you (for profit), and in the hands of very unscrupulous people.

          2. yetanotheraoc Silver badge

            Re: These are the Beginnings

            "it lacks understanding of what it is dealing with"

            Still passes the Turing test.

            1. Chris G

              Re: These are the Beginnings

              When something does, or does not pass the Turing test, it very much depends on who it is interacting with.

              1. Anonymous Coward
                Anonymous Coward

                Re: These are the Beginnings

                BOFH to the boss some years back (after a frustrating start to a conversation): "Stop failing the Turing Test!"

          3. Yet Another Anonymous coward Silver badge

            Re: These are the Beginnings

            >By regurgitating bits of other books that it has already scanned and read.

            So the AIs have been running the philosophy dept for years ?

    2. 96percentchimp

      Re: These are the Beginnings

      I've got Google Nest Hubs around the house, partly because I'm lazy, and partly because their stupidity has become its own entertainment.

      They are fucking morons. Speech recognition is erratic with anything other than clearly-enunciated RP English, on top of which they frequently respond - unprompted - to unasked questions based on a misunderstood sentence that contained something sounding a bit like "Google". When they do answer a question, it sometimes takes several iterations to get an answer with the correct context.

      I don't know if Alexa is any better, but I'm reassured almost daily that if Google can't get this right, then AI is not going to take over the world any time soon. Or it might be a feature designed to lull me into a false sense of security. Damned cunning, these AIs.

      1. WolfFan Silver badge

        Re: These are the Beginnings

        Wintermute and HAL say ‘hi’.

        1. Yet Another Anonymous coward Silver badge

          Re: These are the Beginnings

          We were promised murderous all powerful AIs (HAL, Forbin project, WOPR) - we got predictive text and youtube recommendations

  16. hoola Silver badge

    AI - The Answer To Everything

    AI, Artificial Intelligence, Machine Learning are the more recent fads on buzzword bingo that have been touted as the solution to everything. They are just sets of rules and algorithms that are supposed to be able to take jumbled data and do something useful with it.

    Everything in this field is dependent on what the seed data is and how those who program the rules see the outcome. Assisted decision making might be a better phrase. These systems are not intelligent, just look at the total pig's ear we have with "self driving vehicles". There are so many caveats that they are just an experiment that does not go too badly wrong until there is a disaster.

    Discussions around road markings or signs not being clear miss the point, to be actually useful you should not have to upgrade these factors to make something "work".

    The real concern is that these systems are seen to be infallible until they go wrong. Once they go wrong there still is no clear path of liability or responsibility. So many decisions are made by computers that affect everyone's lives based on the information a system holds or has access to that can have a life-changing impact but are next to impossible to challenge because the companies making the decisions hide behind websites and IT.

    There needs to be unbreakable rules around:

    Regulation

    Liability Ownership

    Responsibility

    The trouble is that this is all driven by a largely unregulated, rule ignoring tech sector with very deep pockets.

    1. Charles 9

      Re: AI - The Answer To Everything

      Not to mention conflicting sovereignty. You pretty much need to have a Ruler of the World to solve that problem...

  17. Anonymous Coward
    Anonymous Coward

    Methinks someone's degree in computer science needs revoking

    -assuming he actually passed in the first place. I bet Brigham Young isn't happy about one of its alumni being responsible for creating porn!

    1. WolfFan Silver badge

      Re: Methinks someone's degree in computer science needs revoking

      Especially kiddie porn, particularly when you consider that certain people referred to the original Brigham Young, for whom the university is named, as Bring ‘Em Young...

  18. G7mzh

    I've long thought ...

    ... that the people who write these filters have dirty minds and are bigger perverts than the content they're trying to filter.

    1. Sherrie Ludwig
      Pirate

      Re: I've long thought ...

      Well, takes a thief to catch a thief?

  19. petef

    AI 101

    "the quality of the data used to train the model is important"

    Er no, it is essential.

    1. Cuddles

      Re: AI 101

      Indeed. It seems to be a common theme in "AI" circles to consider training data as some kind of necessary evil. You have a nice, shiny AI just waiting to change the world, but first you have to slog through the boring bit of putting some old data through it just to kickstart things.

      Of course, the reality is the exact opposite. The AI only exists as a tool to interpret the data. Machine learning can be a neat way to trawl through large datasets to find some kind of meaning, and then apply what you've learned to other data. The software that does that interpretation is largely irrelevant, it's the data that actually contains what you want to know.

      So that quote about quality of data being important exposes one of the biggest issues the whole AI scene seems to have, which is that they don't seem to have any understanding of what they're doing. Quality of data isn't important, it's literally the entire point. What would be the point of trying to develop a system to interpret data if you don't actually have any data you want to interpret? Which is why this particular example failed so badly. They didn't actually have any data they were interested in, they just blindly developed some software and then went looking to find whatever random crap they could feed into it.

      1. Anonymous Coward
        Anonymous Coward

        Re: AI 101

        Hear, hear! Another example of the old cartoon:

        "You guys start programming; I'll go ask the users what they want"

      2. Francis Boyle Silver badge

        I think

        you have just created the profession of AI Dataset Curator and probably won some sort of buzzword bingo in the process.

  20. petef

    Post Office

    This has shades of the sub-postmaster "fraud" debacle. How can a computer possibly get things wrong?

  21. Charles 9

    What I would like to see is something like that trained specifically to write dirty stories.

    1. Anonymous Coward
      Anonymous Coward

      You could do that in a heartbeat yourself. NovelAI.net lets you train AI Modules with your own data (just throw it into .txt files up to 50 MB) and throw it in your story. Watch it go nuts.

      1. Charles 9

        The asking price for that tier is too high right now. I might try it for a month or wait for a discount.

  22. stewwy

    Stories need a

    Villain, and what is a villain, well it's someone who does evil whatever that is.

    So tempting the protagonist is a valid story move.

    Feeding the AI with villain tropes is always going to go dark quickly.

    But what can you do, ''Real'' literature is often pretty dark.

    I guess Evil Corp stories wouldn't go down well with corporate investors

  23. Anonymous Coward
    Facepalm

    There is no way this was ever going to end well.

    ^ see title.

  24. MachDiamond Silver badge

    The problem with machines (and programmers)

    It's one thing to get banned for something the system does, but it's getting to be commonplace that once banned, you have no way to log into the system to get any sort of support. Another issue is when a system just assumes you have certain tech such as Text or you've installed "the app". The last thing I want to do to get customer service is install some dodgy app.

    It's important that customer service is reachable without having to login, install an app, etc. This can mean that the network doesn't have all of its eggs in one basket. I dropped a web host as when they went down, so did their VOIP phone lines, their web page, my web pages, everything.

  25. Ilsa Loving

    Too late

    There's no way they'll be able to recover from this. They screwed up too badly. The best thing would be for the investors to take back what cash they can.

  26. W.S.Gosset

    "Choose your own adventure"

    Not like that!

    Racist Pedo

  27. Paul Hovnanian Silver badge

    Who gave ...

    ... Tay an account on this system?

  28. Anonymous Coward
    Anonymous Coward

    No mention of retraining the AI without the bad data, though.

    Doubling down on a disastrously bad set of inputs, methinks.

    In short, "We're going to keep flogging this mess for profit until people wise up and stop feeding the trolls."

    1. I ain't Spartacus Gold badge
      Happy

      Flogging you say? Ooh kinky... Tell me more!

  29. Anonymous Coward
    Anonymous Coward

    Old Joe Smith would have been proud

    He wasn’t exactly squeaky clean himself…including the most egregious activities.

    Citation: The naked Mormonism podcast (series)

  30. Jonjonz

    The last paragraph is proof why 230 is detrimental to civilization. Promoting filth and hate speech for profit was never the intent of 230.

    It was written before anyone had a clue about what the lack of liability would do to throw gasoline on the fires of racism, luddites, and Nazis.
