How Facebook uses public videos to train, deploy machine-learning models and harvest those eyeballs

Facebook this week revealed an internal project to create machine-learning models that can understand visual, audio, and written content from videos publicly uploaded to its social network. One of the models, known as Generalized Data Transformations (GDT), is now used on Instagram. Users viewing short video recordings, or …

  1. Tron Silver badge

    It's a world wide web, not your local pub.

    'Public' means public. Use the privacy options if you don't want something to be public, your tweets to go viral, or your photos recycled as memes. Or someone to train their terminators on your stuff.

    I'm amazed that people broadcast their views and post their photos on social media and forums and then complain that folk hurl abuse at them. The internet is not your lounge or your local pub. It's a big world out there. A percentage of people will disagree with absolutely anything you say, and a percentage of them will be rude, aggressive, racist, misogynist and quite nasty. A small percentage of a very large number is a lot of people and a lot of negative comments/hate.

    I've had endless amounts of abuse over the years. At first it gets to you, but after a while you get used to it. It's a learning process. We are acclimatising to the internet. All new tech requires that. Print, radio, TV and now the net.

    The solution is not to pick those who have a bad experience out of the crowd, magnify it, and then exploit it, as governments do, as an excuse for censorship. 'Public' means public. Understand that. It's not a tech problem but a human nature problem.

    Google are making a rod for their own backs with the arrogant, petty attitude of some senior staff. Having enough lawyers to squash and silence staff or competitors may be the American way of business, but it gives governments their point of leverage. Big tech does not appear to understand that governments have been taking down the competition for centuries and are rather good at it. These are people who can turn them off, regulate them out of existence, use them as a revenue stream via fines and taxes, or recycle them as an outsourced provider of a surveillance state infrastructure. Big tech's arrogant and unethical behaviour is a free gift to the people they should really fear.

    1. ecofeco Silver badge

      Re: It's a world wide web, not your local pub.

      Good points, but since when did psychopathic boards of directors ever use good sense?

      Also, the comeuppance usually comes far too late for the lives they've damaged and the money they've made.

      1. John Brown (no body) Silver badge

        Re: It's a world wide web, not your local pub.

        "Good points, but since when did psychopathic boards of directors every use good sense?"

        Almost certainly never. However, it's worth noting that a psychopathic board of directors may have a different view and conception of ethics than the general population. Ethics don't have to be societally good, although that is what most people would like them to be. Clearly Google's Ethics Committee is only good for Google if the people on the committee agree with Google's conception of ethics.

    2. 89724102172714182892114I7551670349743096734346773478647892349863592355648544996312855148587659264921

      Re: It's a world wide web, not your local pub.

      >Big tech's arrogant and unethical behaviour is a free gift to the people they should really fear.

      It won't be long before Government and Big tech are one and the same: President Zuckerturd...

  2. DJV Silver badge

    "videos of tractors"

    Wooo! Tractor porn - can't get enough of it!

    1. chivo243 Silver badge
      Coat

      Re: "videos of tractors"

      I love a good tractor pull!

    2. Korev Silver badge
      Coat

      Re: "videos of tractors"

      Ewww, I think I'm going to bale out of this thread now...

      1. Anonymous Coward
        Anonymous Coward

        Re: "videos of tractors"

        We can combine our efforts on that.

        1. Fruit and Nutcase Silver badge
          Pint

          Re: "videos of tractors"

          I've got a brand new combine harvester...

          https://www.youtube.com/watch?v=tb63PdPweDc

    3. ecofeco Silver badge

      Re: "videos of tractors"

      Just... just watch out for those PTO ones. The way they hook up is kinky. Real kinky.

  3. Mike 137 Silver badge

    machine-learning models that can understand visual, audio, and written content

    "Understand" hardly seems the most appropriate word. Current machine learning is essentially template building and matching, which, although a component of understanding, falls far short of its totality for intelligent beings. I suppose though it depends to some extent on the standards you're prepared to accept. The word is used by politicians and bureaucrats so the lower bound might possibly be akin to mere pattern matching.

    1. John Brown (no body) Silver badge

      Re: machine-learning models that can understand visual, audio, and written content

      And then using that pattern matching to make recommendations to users almost always seems to fail. It's precisely this kind of pattern-matched recommendation which sends people down the rabbit hole of conspiracy theories, such that they end up inside toxic social bubbles.

      Most people have multiple interests. Someone who just watched a video of a tractor may well like tractors. But the odds are that they have multiple interests and do not want a constant diet of tractor videos.

  4. Ken Moorhouse Silver badge

    Tractors

    One of the problems with the web is getting side-tracked. So someone searching for a tractor might inadvertently click on The Wurzels song, and then have to wade through loads of links for other -er- novelty songs before being able to get back to their primary goal. If such choices are sponsored then one can imagine who has the bigger wallet: some media corporation, or a manufacturer of tractors.

    I'm sure others can think of better examples, many of them being NSFW.

    1. ThatOne Silver badge

      Re: Tractors

      The "AI chose for you" part will be vanishingly small compared to the "sponsored content" part.

      After all, the point isn't to be helpful, but to make money.

    2. You aint sin me, roit
      Trollface

      The gateway to Wurzel porn...

      You start off innocently looking at tractor photos, then the AI suggests combines...

  5. ThatOne Silver badge
    Facepalm

    Reality check please

    > have urged [...] academic conferences to decline funding from the company

    Sure, will happen. Everybody knows research is just rolling in dough and can afford to be picky. "Oh no, your money smells, we only accept ethical organic free-range vegan money without Gluten."

    1. John Brown (no body) Silver badge

      Re: Reality check please

      FWIW, that is happening now and to some extent has always happened. It's not as common as some would like, but there are organisations who choose an ethical stance based on their members or public perception and then stick to it. Many banks, for example, try not to invest directly or indirectly in arms manufacturers. Then there's the concept of "blood diamonds". Not to mention "modern slavery" and "sweatshop labour", which most companies will either steer away from or at least distance themselves from, so they can blame a sub-contractor if it comes to light.

      There have been stories in the news recently of universities turning down grant money from organisations seen as unethical, e.g. oil companies, coal mining companies, etc.

      All this is the exception rather than the rule, but public and peer pressure does work, albeit slowly and over many years.

      1. ThatOne Silver badge

        Re: Reality check please

        > there are organisations who choose an ethical stance

        True, but they have to be able to afford that stance, and academia is much more fragile in that aspect. Labs saying "Well, we simply can't take that big grant of unethical money. We need to close shop and go walk dogs or pack hamburgers." are (literally) vanishingly rare.

  6. iron Silver badge

    So their AI can detect the subject of a video to show you more of the same. How about using it to find illegal videos of shootings, torture, etc and remove them from FB? Nah just ruin the lives of poorly paid human moderators instead. They're probably cheaper.

    1. CrackedNoggin Bronze badge

      Part of the problem is that AI hasn't yet got that far. Even if AI detects 90+a% of the nasty stuff, and of that only 90+b% is actually nasty stuff, there is still way "too much" (*open to interpretation) getting through.

      > "They're probably cheaper."

      Certainly CPU/GPU time and electricity costs are constraining factors - however, even in well-endowed research centers I do not believe AI has yet reached the maturity needed to match the judgement of those underpaid moderators. And maybe if(*)/when it does, the price of that AI would drop like a rock within a few years. Personally, I think the limit might be in the fabric of the hardware - something less rigidly 2D and more flexibly 3D will yield exponentially better results - but that's just speculation.

      (*"If" because nature and/or human nature could at moment destroy the environment necessary to sustain further research).

    2. Anonymous Coward
      Anonymous Coward

      I believe they do already use automated methods - the problem is that the quantity (~70 million items in 2018) and the content mean they get humans to assist with cases that potentially require the involvement of security services or law enforcement.

    3. ThatOne Silver badge

      > How about using it to find illegal videos of shootings, torture, etc

      First of all, FB won't remove its moneymaker. It's not aunt Mary's cake recipes which attract viewers.

      Second, AI is actually just A: it can't tell the difference between reality and games (for instance), so it will cause a lot of collateral damage. It might help filter out suspects, but it still requires a human to make the final decision.

  7. CrackedNoggin Bronze badge

    "Timnit Gebru and Margaret Mitchell were expelled from Google after they pushed back on management, who had asked them to remove their names from a paper scrutinizing the social and environmental impact of large language models, like the ones used by Google."

    First priority, I reckoned, would be to browse the paper at the center of the controversy. But neither of the links given seemed to reference the paper - instead they just covered the fallout. Maybe my Foo is lacking. My Foo's fault or otherwise, I cannot even attempt to make sense of the information without that data.

    1. CrackedNoggin Bronze badge

      OK, I found the paper [ https://faculty.washington.edu/ebender/papers/Stochastic_Parrots.pdf#cite.vaswani2017attention ]

      Quote 1:

      "While the average human is responsible for an estimated 5t of CO2 per year,the authors trained a Transformer (big) model with neural architecture search and estimated that the training procedure emitted 284t of CO2."

      Quote 2:

      "Encoding Bias -- It is well established by now that large LMs exhibit various kinds of bias, including stereotypical associations [11,12,69,119,156,157],or negative sentiment towards specific groups [61].

      To put Quote 1 in context: according to the US Energy Information Administration [https://www.eia.gov/environment/emissions/carbon/], total US 2019 fuel-originated CO2 was 5,130 million metric tons.

      Training "a transformer" costs some fraction of millionth of US total yearly CO2 footprint. How often is it run? I don't know but according to this Technology review article [https://www.technologyreview.com/2019/06/06/239031/training-a-single-ai-model-can-emit-as-much-carbon-as-five-cars-in-their-lifetimes/] it costs 1 to 3 million dollars a shot, so I am guessing far less than everyday.

      IMO - well worth it for groundbreaking R&D.
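      The ratio estimated above can be checked directly from the two figures quoted (284t per training run from the paper, 5,130 million metric tons per year from the EIA):

      ```python
      # Ratio of one Transformer-with-NAS training run to total US
      # energy-related CO2 emissions in 2019, using the figures quoted above.

      training_co2_t = 284                    # tonnes, per the Stochastic Parrots paper
      us_annual_co2_t = 5_130 * 1_000_000     # 5,130 million metric tons, per the EIA

      fraction = training_co2_t / us_annual_co2_t
      print(f"{fraction:.2e}")  # roughly 5.5e-08, i.e. about 0.055 parts per million
      ```

      So one such training run is on the order of a twentieth of a part per million of annual US emissions, which is the basis for the "fraction of a millionth" claim.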

      About Quote 2: yes, of course any learning system has bias - that's a mathematical fact. And yes, there are some terrible systems out there claiming to be "AI" that are just reinforcing unjust systems, e.g. [https://www.wired.com/story/algorithms-supposed-fix-bail-system-they-havent/], but those probably have nothing to do with the systems Google is developing, and everything to do with people setting them up purposely to get a desired result, or with misuse.

      IMO - that's not a reason to stop AI research. It is a reason to monitor how AI is being used and to call it out when it is being used incorrectly or for bad things (inside or outside Google).

      Reading the whole of Timnit Gebru's paper is really tedious because it's a long list of whataboutisms, with no actionable proposals (other than the implicit idea to just shut down, or at least downscale, LM research because it's generally a bad, wasteful, and prejudicial thing).

      Timnit Gebru's job was not the right job for someone who thinks that AI research is inherently evil and can see no positive potential in it (she mentioned not one).

      Instead, it's a job for somebody who recognizes the potential benefits of AI but can also provide better focus on actual unfair and biased applications of AI, explaining them in straightforward everyday language that a police chief, DA, or average person can understand. Such a person could use the platform Google provides to amplify that message to good effect.

  8. Winkypop Silver badge
    Alert

    Signed up? Signed up to Facebook?

    Sir, you’re very much mistaken.

  9. Bitsminer Silver badge
    Joke

    Videos similar to....what again?

    if someone on Instagram tends to watch videos of tractors, the GDT recommender system will highlight other videos of tractors.

    And if, hypothetically, GF sends me a short video telling me to F*CK OFF and ending the relationship, can I then expect to be offered more videos of the same?

    Really? Might be fun, I think. I should ask her to send me one.

  10. This post has been deleted by its author

    1. You aint sin me, roit
      Alien

      Meteorites

      Best use I've seen of doorbell cameras is night sky monitoring and ufo tracking.
