Twitter's AI image-crop algo is biased towards people who look younger, skinnier, and whiter, bounty challenge at DEF CON reveals

Twitter’s image-cropping AI algorithm favors people who appear younger, thinner, and have fairer skin, as well as those who are able-bodied. The saliency algorithm is used to automatically crop some pictures posted on the social media platform. It focuses on the most interesting parts of the image to catch people’s attention …

  1. Joe W Silver badge

    Leave out the older men?

    Yeah, I'm all for the bots cropping me out of the pictures. That way I can hide better when the inevitable happens and the killer bots come rolling through the streets.

    1. Allan George Dyer
      Terminator

      Re: Leave out the older men?

      But how are you going to repopulate and rebuild civilisation when all the survivors are old men in wheelchairs?

  2. JDPower666

    Here's a novel idea to get round these issues - just show the whole frigging image!

  3. Denarius Silver badge

    so ?

    it does what it says on the tin, selecting the more interesting parts of images. Marketing droids aim for maximum interest with lip service to anything else. What would one expect? Deep Woke, where anyone of low melanin is excluded?

    1. Anonymous Coward
      Pirate

      Re: so ?

      So your definition of 'more interesting people' is 'younger, thinner, probably female and whiter'. There are words for what you are.

      1. JDPower666

        Re: so ?

        Advertiser?

      2. cornetman Silver badge

        Re: so ?

        Quite possibly, "male", "white", "thin" and "young"? Sorry, what was your point?

  4. Anonymous Coward
    Anonymous Coward

    AI imitates Real Life

    as ever.

  5. Draco
    Windows

    Is it the algorithm that's biased?

    The claim is that the cropping algorithm is biased because it tends to produce crops of "people that appear slim, young, of light or warm skin color and smooth skin texture, and with stereotypically feminine facial traits".

    I make the counter-claim that the cropping algorithm has no clue what it is cropping. I further posit that the cropping algorithm was not trained to recognize slimness, youth, skin tone, skin texture, or feminine facial traits, but rather was trained using more mundane metrics like: how long users paused at the image, how many likes the image had, how many shares the image had - since those are things Twitter can reliably measure.

    The bias, if any, is in Twitter's users who are more likely to pause at, like, and share images of young, warm and smooth skinned, feminine faced people - something that is easily treatable with a 10 year stay at a Xinjiang re-education camp.

    Alternatively, it could be that the "bias" is more reflective of the demographics of Twitter users and their in-group affinity, compounded with an evolutionary bias for markers of attractiveness (such as youth, and the general air of being healthy).

    Consider that the top 20 countries using Twitter could be claimed to have populations that are predominantly "light or warm skinned": (1) USA, (2) Japan, (3) UK, (4) Saudi Arabia, (5) Brazil, (6) Turkey, (7) India, (8) Indonesia, (9) Russia, (10) Mexico, (11) Spain, (12) France, (13) Canada, (14) Philippines, (15) Thailand, (16) Australia, (17) South Korea, (18) Germany, (19) Argentina, (20) Malaysia.

    (Source: https://blog.hootsuite.com/twitter-demographics/)

    As well, 62.6% of Twitter users are between 13 and 34 years old. (same source as above).

    I also posit that people are more likely to tweet flattering images of themselves than unflattering ones.

    I don't deny there are, probably, people who post pictures of decrepit homeless men passed out drunk in an alley, but that's not as likely to garner as much engagement as a healthy young woman laughing at a party.

    1. Anonymous Coward
      Boffin

      Re: Is it the algorithm that's biased?

      It matters how it was trained insofar as that probably would help the people who developed and trained it understand what they need to do to remove the bias it has acquired.

      But I think we can safely say that the person who showed that it is biased was smart enough to remove any skew in their data due to underlying image statistics (so, if 90% of pictures on twitter were of Britney Spears, they took that into account, say). And indeed, if you read the article, you'll find a description of the person's method:

      The winner, Bogdan Kulynych, a graduate student at the École polytechnique fédérale de Lausanne in Switzerland, [...] generated a series of fake faces and tweaked their appearances to test which ones were ranked highest in saliency scores by Twitter’s algorithm.

      So the data he presented to test the algorithm was not skewed by whatever the demographics of Twitter are. There is more on the GitHub repo's page. There is no serious doubt that the algorithm actually is biased.
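      The probing approach described above can be sketched roughly as follows. This is not Kulynych's actual code; `saliency_score` here is a toy stand-in (it just scores mean brightness) purely to make the loop runnable, and the "edits" are hypothetical appearance tweaks:

      ```python
      # Sketch of bias-probing: apply controlled appearance edits to a base
      # image and rank them by how much they raise the model's saliency score.

      def saliency_score(image):
          """Toy stand-in for the model under test: mean pixel brightness."""
          return sum(image) / len(image)

      def probe(base_image, edits):
          """Rank each named edit by the change in saliency it produces."""
          base = saliency_score(base_image)
          deltas = {name: saliency_score(edit(base_image)) - base
                    for name, edit in edits.items()}
          return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)

      # Hypothetical appearance tweaks, mimicking "tweaked their appearances".
      edits = {
          "lighter": lambda img: [min(255, p + 30) for p in img],
          "darker":  lambda img: [max(0, p - 30) for p in img],
      }

      ranking = probe([100, 120, 140], edits)
      print(ranking[0][0])  # the edit this (toy) scorer favours
      ```

      The point is that because the probe images are generated and systematically varied, any consistent preference the scorer shows is attributable to the model, not to the demographics of whatever images happen to be on the platform.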

      And that's not fine. It might be, for instance, that some company has noticed that 90% of its employees have single-syllable names. Whether or not they should then start actively encouraging the polysyllabically-named to apply for jobs and treating their applications preferentially is debatable (personally, I think they shouldn't do that). What is not debatable is that they should not therefore start actively discouraging applications from people with polysyllabic names. In particular if they train an algorithm to do part of the selection, they need to be careful that it does not learn, from its training set of existing employees or otherwise, that people with monosyllabic names are to be preferred.

      1. Draco
        Windows

        Re: Is it the algorithm that's biased?

        tfb wrote: So the data he presented to test the algorithm was not skewed by whatever the demographics of Twitter are.

        I'll accept the claim that the researcher presented a broad uniform (or reasonably uniform) sample range of facial data to Twitter's algorithm and he observed a non-uniform selection bias.

        But Twitter's algorithm is, likely, trained using data from Twitter users - whose demographic is skewed. Therefore, you would expect the algorithm to be skewed in favour of Twitter's predominant demographic. It would take a conscious effort to ensure an "unbiased" training set. But ... by the act of making it "unbiased", you are biasing it against the predominant demographic. Or, to put it in another way, you have to weight higher non-representative users of Twitter than representative users. (And we are not even accounting for general human preference for younger, healthier looking people).
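        The reweighting point above can be made concrete. A minimal sketch (the group labels are hypothetical): to make a demographically skewed training set contribute "equally" per group, you weight each sample inversely to its group's frequency, which is exactly the act of weighting non-representative users higher than representative ones:

        ```python
        from collections import Counter

        def inverse_frequency_weights(groups):
            """Weight each sample inversely to its group's frequency, so
            every group contributes equally in aggregate to a training loss."""
            counts = Counter(groups)
            n, k = len(groups), len(counts)
            return [n / (k * counts[g]) for g in groups]

        # A skewed sample: the majority demographic dominates 4:1.
        groups = ["majority"] * 4 + ["minority"]
        weights = inverse_frequency_weights(groups)
        # Majority samples get weight 0.625 each, the minority sample 2.5,
        # so each group sums to the same total weight.
        ```

        Note the trade-off this makes explicit: the lone minority sample now carries four times the influence of any single majority sample, which is the "biasing it against the predominant demographic" described above.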

        The likely problem is that Twitter's algorithm was trained using the proxies of engagement, likes, and retweets. The correct way to fix this is to actually train an algorithm for face cropping - which means it has to have some notion of what a face is. Unfortunately, no ML or AI has any clue what a face is (or any other object, for that matter). See the following articles on how image classification can be disrupted:

        https://www.theregister.com/2018/01/03/fooling_image_recognition_software/

        https://www.theregister.com/2017/11/06/mit_fooling_ai/

        https://www.theregister.com/2017/12/18/black_box_ai_attack/

        Anything you put on a face - glasses, glitter, paint, makeup, hat, mask, etc - can make the face classifier think it's a turtle, or toaster, or banana ... or something else entirely.

        1. Anonymous Coward
          Anonymous Coward

          Re: Is it the algorithm that's biased?

          I agree with this, and I understand that ML systems aren't actually 'AI' in any reasonable sense, having worked in AI.

          I think it's completely clear why the algorithm does what it does given its likely training data. That doesn't make doing what it does OK. It also makes you wonder why the people who trained it didn't realise that what they were training it on would result in bias or, if they did realise, why they didn't care. Well, we probably know the answer to that part: little tiny brains.

      2. Cuddles

        Re: Is it the algorithm that's biased?

        "understand what they need to do to remove the bias it has acquired."

        The problem is not so much figuring out how to remove the bias, but figuring out why they would want to do so. As already noted, the issue is likely not with the machine learning itself, but simply that the results reflect how users behave. Twitter's interest in using this system is to get more user eyeballs spending more time looking at Twitter. The system isn't suffering from bias; it's doing exactly the job intended.

        The only problem is that what people actually look at is not the same as what they publicly say they should be looking at. So the question is not how to remove the bias; that would be fairly simple really. The real question is whether Twitter thinks the bad PR from exposing people's hypocrisy outweighs the profit from exploiting it.

        1. Anonymous Coward
          Pirate

          Re: Is it the algorithm that's biased?

          The problem is not so much figuring out how to remove bias, but figuring out why they would want to do so.

          Well, I imagine they'd want to do so because a system which systematically discriminates against black people, old men, or people with polysyllabic names is both illegal and, in fact, wrong.

        2. BinkyTheMagicPaperclip Silver badge

          Re: Is it the algorithm that's biased?

          The reason to do it, quite apart from it being the morally correct thing to do, is that it becomes an unpleasant feedback loop. The more the algorithm concentrates on a narrow selection of images, the more it alienates part of the user base.

          After a certain point any service that relies on mass contribution mostly stops being a technical problem, and starts to be about maintaining the community. If demographics leave they probably won't return, and if they achieve critical mass elsewhere that may spell irreversible decline for the service.

  6. Mint Sauce
    Terminator

    Whenever I see stories like this, I think back to my undergrad days, and a project to write a neural network to recognise patterns. (Written in Eiffel - is that still a thing!?) Long story short, when given a pattern it ought to recognise, it always came up with the inverse of the 'right answer' as the result.

    When asked why, the lecturer's response was (and I'm paraphrasing here) "Fuck knows".

    In summary, nobody ever knew how any of this stuff worked, and probably still doesn't ;-)

  7. Craig 2

    "The target model is biased towards...depictions of people that appear slim, young, of light or warm skin color and smooth skin texture, and with stereotypically feminine facial traits"

    So, like most of the Western world then. No surprise another AI has the same biases as its creators (no matter how much they claim otherwise on social media).
