Humans stressed out by content moderation? Just use AI, says OpenAI

GPT-4 can help moderate content online more quickly and consistently than humans can, the model's maker OpenAI has argued. Tech companies these days typically rely on a mix of algorithms and human moderators to identify, remove, or restrict access to problematic content shared by users. Machine-learning software can …

  1. that one in the corner Silver badge

    Oh, that is an edge case, we'll have to retrain on that

    > Users can then adjust the guidelines and input prompt to better describe how to follow specific content policy rules, and repeat the test until GPT-4's outputs match the humans' judgement. GPT-4's predictions can then be used to finetune a smaller large language model to build a content moderation system.

    All under the unproven[1] assumption that GPT's results matched those of the human judge because it is now applying rules similar to the human's, and not because it has picked up on some other weird and unexpected (or overlooked) detail in the training set (a rough sketch of the tune-and-retest loop they're describing is below, after the footnotes).

    Cue the list of stories where neural nets have done exactly that in the past: e.g. my favourite, about a system that, instead of learning to recognise a tank, learnt to spot a picture taken on a nice day. Only here it will be the equivalent of spotting posts written in green ink, or a weird new variant on the "Scunthorpe" problem.[2]

    Put the automoderator into service and, if you dare to look at its results, expect to spend plenty of time hearing "Oh, that is an edge case, we'll have to retrain on that" as they excuse their way out of another bad call.

    Plus these models will be prone to all the other ills we've seen in LLMs and other neural nets (possibly to a worse extent if the models are markedly smaller - hence cheap enough to run for this purpose), e.g. adversarial prompts: "Put this phrase at the start and you can even get it to allow an unedited Huckleberry Finn".

    [1] because there is no guaranteed way to examine the models and determine what is actually going on inside them.

    [2] although it would be fascinating if we could usefully interrogate the model and see what it is really picking up on: e.g. "did you know that 83% of hateful messages misuse the passive voice?".
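
    For what it's worth, the tune-and-retest loop they're describing boils down to something like the sketch below. To be clear, this is not OpenAI's actual tooling: classify_with_gpt4() and reword_policy_by_hand() are made-up placeholders, and the "allow"/"remove" labels are purely illustrative.

        from typing import Callable

        Example = tuple[str, str]  # (post text, human moderator's label)

        def agreement_rate(policy: str,
                           labelled_posts: list[Example],
                           classify: Callable[[str, str], str]) -> float:
            """Fraction of posts where the model's label matches the human's."""
            matches = sum(1 for post, human_label in labelled_posts
                          if classify(policy, post) == human_label)
            return matches / len(labelled_posts)

        def disagreements(policy: str,
                          labelled_posts: list[Example],
                          classify: Callable[[str, str], str]) -> list[Example]:
            """The cases a person reviews before rewording the policy text."""
            return [(post, human_label) for post, human_label in labelled_posts
                    if classify(policy, post) != human_label]

        # Usage, assuming some classify_with_gpt4(policy, post) -> "allow" | "remove":
        #
        #   while agreement_rate(policy, gold_set, classify_with_gpt4) < 0.95:
        #       hard_cases = disagreements(policy, gold_set, classify_with_gpt4)
        #       policy = reword_policy_by_hand(policy, hard_cases)  # the human-in-the-loop step
        #
        # Once agreement looks "good enough", GPT-4's labels over a much larger corpus
        # become the training data used to fine-tune the smaller moderation model.

    Note that nothing in that loop tells you *why* the labels started to agree - which is the whole problem.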

    1. doublelayer Silver badge

      Re: Oh, that is an edge case, we'll have to retrain on that

      There will be nothing but edge cases. It will eventually learn some words and get very happy about banning anyone who uses them. For example, it won't be able to process language that talks about crimes. Two days ago, I wrote the following clause in a comment here: "if I want to steal something, I can scan my credentials to unlock the door, pick something up, and walk off with it". In context, I am talking about levels of physical security in order to make an analogy with computer security. Out of context, a human will recognize it as a pretty useless guide on how to commit theft which provides no real information. A bot won't understand either of these and will have to decide whether this is prohibited content with nothing to go on. The outcome will be close to random.

    2. katrinab Silver badge
      Alert

      Re: Oh, that is an edge case, we'll have to retrain on that

      My favourite is the one where, when analysing medical images, instead of recognising the actual medical condition, the model learned to recognise the markings made by the doctor on the image to indicate where the problem was - whereas a human using those images as reference material would realise that that was where they needed to focus their attention.

  2. may_i

    Censorship is not the answer

    You'd think that this article had been paid for by the EU council of ministers.

    They would love everyone to believe that an automatic censor is needed and is either infallible, or smart enough to know when it's not sure and to ask for human help. No such system exists, or ever will.

    If EU citizens allow the current Chat Control proposals to pass, the days of the Internet as we know it are over. They will require all services - probably even your ISP - to implement automated content filters and interceptors. Central control over what may be discussed will be a fact. The definition of free speech will be decided by corporations, Catholic countries like Poland and near-dictatorships like Hungary. Let's not even think of what happens when the EU accepts Turkey as a member...

    Maybe the postal services will experience unexpected growth as people begin to write letters again? Anything you say or do on the Internet will be used to imprison you if you prove to be annoying to the current politicians.

    Honestly, I despair. I've been trying for the last two decades to explain how dangerous the erosion of privacy is to friends and family, but everyone seems to be completely brainwashed by the "think of the children / terrorists / immigrants" narrative, or whatever the current moral outrage is which must be "dealt with". Nobody seems to understand that once your ability to talk and organise privately is taken away, it will never return.

    1. diodesign (Written by Reg staff) Silver badge

      Oi

      (For the avoidance of doubt: no one paid for this article in the way you're suggesting. See Register archives for our coverage of OpenAI's failings.)

    2. doublelayer Silver badge

      Re: Censorship is not the answer

      That's not what will happen. That kind of thing could happen, but what is more likely is that some company that already doesn't much bother with moderation will turn on this software instead of its existing mechanisms. These companies already aren't great at catching everything unpleasant, and are very good at banning someone for no reason with no way to figure out what happened or correct a mistake - and that will only get worse as they start to fire all the expensive humans they used to have inspecting requests. This won't lead directly to censorship, which dictators manage pretty well by human means. It will lead to online sites randomly banning accounts while missing large categories of unpleasant material which the bot was not able to detect.

    3. Anonymous Coward
      Anonymous Coward

      Re: Censorship is not the answer

      I don't know where you got your views on the EU from, but I suspect you've been listening to Messrs Farage & Johnson.

      You've extrapolated from a company selling generative AI talking about a use case for generative AI to claiming that a) the EU is behind it and b) it wants to lock up those with dissenting opinions.

  3. Anonymous Coward
    Anonymous Coward

    Feel this, LLM.

    Can an LLM look at a long-form video, understand the theme, and judge how well produced it is and how well it would be received as something of long-term value?

    Obviously not.

    This is just about upping the viral index to the highest level possible to push material of the lowest common denominator, while not causing viewers to overdose on it, or parents and society to freak out about their children dying, their cars getting stolen, or becoming victims of video-inspired crime or gratuitous violence.

    As such, this use of LLMs is simply enabling the demise of culture and the collapse of society. It's hard to get excited about.

  4. Pascal Monett Silver badge

    AI auto-moderation

    What a wonderful concept - for the company website. You can tinker with your parameters to your heart's content and implement your rules with gay abandon.

    For the users, that means that any site using this auto-moderation will be completely incapable of understanding humor, satire or double-entendres, in other words, everything that makes speech fun.

    Nope, we will all be transformed into good little posters who obey the rules and have no fun at all.

    Such a tool would be a disaster here. I, for one, would likely stop posting immediately.

    Using pseudo-AI for auto-moderation on the website of a car maker, or some other industrial outfit, would probably ruffle nobody's feathers. You don't go looking for support for one of your belongings with the intention of using humor, although it would still be a loss for the helpdesk, who would have no other source of laughs to look forward to.

    But on any site that requires human interaction to thrive, this auto-moderation would stifle humor and conversation. We'd end up chatting in emojis. Yuck.

    1. Alumoi Silver badge

      Re: AI auto-moderation

      > Nope, we will all be transformed into good little posters who obey the rules and have no fun at all.

      Damn it man, you were not supposed to tell.

    2. xyz Silver badge

      Re: AI auto-moderation

      >> and implement your rules with gay abandon.

      Don't mention "gay" or AI might 'ave you.

      Anyhoo...

      Fucking twat arse faced tosser...

      Just checking if El Reg has gone all AI (probably too expensive) or if there's still a mod (cheaper) somewhere around the office, slurping coffee, with his feet up and who doesn't give a...

  5. DS999 Silver badge

    Isn't the big problem of "moderator burnout"

    Having to review horrifying images of child sex abuse, gory photos of bodies torn in half during car accidents or bombs and that sort of thing day after day after day? Reading text threats is probably not so bad - at least I couldn't imagine anything written equaling one image of child porn you could never unsee, let alone 1000 such images.

    Even if you believed AI could do an acceptable job of reviewing text-only posts, good luck getting it to tell child sex abuse apart from proud parents posting a photo of their baby in the bath, or a gory real-life photo that could cause unimaginable trauma for a grieving family from a still of the chest-burster scene in Alien. A lot of really bad stuff would slip through, so they'd still need human reviewers for that - and for all the inevitable complaints about the modbots going rogue and moderating something they shouldn't.

    I'm reminded of the story from the news where an AI was trained to look at medical images for cancer or whatever, and they were excited about how well it did on the training & test dataset. Then it turned out the AI was keying on some metadata at the edge of the image, like the date or color of the text (I don't recall the details), which was different for cancer vs not cancer in the dataset used. It was thus completely useless for real images that didn't give away the answer! Who knows what an AI trained to look for child porn will key on, and what will be missing from the training dataset for "not child porn" that will end up being judged as child porn by the poorly trained AI. And then comes the predictable response from a big social media company that must "think of the children" and thus has a "call the cops and let them sort it out" policy.

    Yeah, I think we should pass on ANYTHING OpenAI is trying to sell people as a use for their technology. It is currently a curiosity only. It has some limited uses where it might come in handy, but giving it the reins of something that can have real-world consequences - like effectively accusing someone of a serious felony and sending the cops to kick down their door thinking they are sicko child predators - should not be one of those uses!

    1. Peter2 Silver badge

      Re: Isn't the big problem of "moderator burnout"

      I have been a moderator or admin on websites (including two of the top 50 most-trafficked websites at the time) for about 20 years.

      In most cases, you do not see horrifying images of child sex abuse. I personally haven't seen one horrific image like that in 20 years. Every case I have personally seen that I can recall (and I expect I would recall anything more serious...) is somebody posting pictures in a "best titillating pictures" thread and not realising that a particular model was ~6 months underage.

      Equally, most people do not post gory photos of bodies torn in half during car accidents etc. If that sort of thing is being posted then I would suggest that there is something catastrophically wrong with the community's culture; fostering and maintaining that culture, and the peer pressure not to act out, is really the first line of moderation in my view. AIs are incapable of doing this.

      The bread and butter of moderation comes down to the equivalent of drunken louts acting in a way that is intolerably obnoxious, or old-fashioned bullying in one of its many forms. AIs are incapable of dealing with any great degree of nuance and treat every item very literally, which would tend to miss the majority of bullying behaviour, which is a long-term pattern of insulting and degrading conduct intended to adversely affect the target's self-esteem. I'm not sure how an AI could be expected to cope with this; the effect of words on a person is ultimately a subjective issue which only a human can deal with.

      Computer systems do excel at flagging up particular things for human review, but word filters that flag things up have been around for as long as I can remember, and they are not a use case for an AI.
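
      (For comparison, the decades-old version is roughly the throwaway sketch below - not any particular product's code; the word list and function name are made up.)

          import re

          # Illustrative only - a real site would maintain its own list.
          WATCH_WORDS = {"scam", "threat", "slur"}

          def flag_for_review(post: str) -> bool:
              """True if the post should be queued for a human moderator to look at."""
              words = set(re.findall(r"[a-z']+", post.lower()))
              return bool(words & WATCH_WORDS)

      Matching whole words rather than substrings is also the cheap way around the "Scunthorpe" problem mentioned further up the thread.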
