Meta's AI-based Wikipedia successor 'may be the next big break in NLP'

Meta has open-sourced a machine-learning resource that could one day supplant Wikipedia as the world's biggest publicly available knowledge-verification database. Dubbed Sphere, it can be used to perform knowledge-intensive natural language processing, or KI-NLP, we're told. In practical terms, that means it can be used to …

  1. The Man Who Fell To Earth Silver badge

    What could possibly go wrong?

    So they think Wikipedia is accurate, eh? Explains a lot.

    Let's all ask it about Scotland...

    1. Michael Wojcik Silver badge

      Re: What could possibly go wrong?

      Indeed. "Like Wikipedia, but without the accuracy!" is not a motto that inspires. Nor is "Like Google results, except you don't have any context!".

  2. Anonymous Coward

    Another tool for Suckerberg

    I assume we'll see this in Farcebook any day now. Unlike the scientists, they don't care about accuracy. They want eyeballs and engagement for ads. Controversy over accuracy, if anything, increases eyeballs and engagement.

    Too bad this wasn't from a more reputable NLP research source.

  3. Anonymous Coward

    Disputed comment?

    What if your Sphere search turns up a load of incorrect and/or libellous bollocks? How do you appeal a result from such a system?

    Okay, inaccurate results can be returned from Wikipedia, and their dispute mechanism isn't great.

    But how do you challenge an ML system owned by a twat who's only interested in profit?

    I also accept Wikipedia makes massive amounts of money off the back of real contributors, but I see Sphere being a Bad Thing.

    We need to kill FB/Meta. We can probably leave The Big Z alive as long as he's neutered - I'm feeling generous tonight.

    1. skierpage

      Re: Disputed comment?

      "I also accept Wikipedia makes massive amounts of money off the back of real contributors"

      ??!!? The Wikimedia Foundation is a non-profit. It runs no ads, it doesn't track users. It makes its incomparably valuable articles and Wikidata freely available. It asks for donations to pay for the servers, software development, and supporting the volunteers.

  4. unimaginative

    Doing what search engines already do.

    Google already does this.

    Google's results contain a summary at the top.

    Google's result seems more informative to me.

    It looks like a search engine with extra buzzword compliance.

  5. Il'Geller

    I did this 20 years ago and the present technology is patented.

  6. Inventor of the Marmite Laser Silver badge

    It's from Meta. I'd trust it about as far as I could throw this planet.

    With one arm tied behind my back.

    And the other one encased in concrete.

  7. Sorry that handle is already taken. Silver badge

    But...


    I don't want there to be an encyclopedia article about me. This is going to be a privacy nightmare.

    1. Anonymous Coward

      I don't want there to be an encyclopedia article about me.

      Don't worry. It will only contain about six lines, and otherwise imply you are the most honest of men.


    2. StevenP

      Re: But...

      I want my entry to be 'Mostly Harmless'.

  8. elDog

    Wikipedia has become a valuable resource because it is properly formatting its results

    This has taken countless hours to put raw information into a digestible format - both for human eyes and computer tendrils.

    I don't see where a new engine just returning links to randomly put together pages is really helpful.

    1. Falmari Silver badge

      Re: Wikipedia has become a valuable resource because it is properly formatting its results

      It is just a glorified search engine. It finds web pages, selects one, and lifts some data from it.

      The example in the article. Who is Joëlle Sambi Nzeba?

      Typing Joëlle Sambi Nzeba into a search engine got me this

      Wonder what it does with a very common name.

      1. Michael Wojcik Silver badge

        Re: Wikipedia has become a valuable resource because it is properly formatting its results

        Doesn't even have to be very common. Once I did a quick web search for my name and made a list of three dozen or so people who have my name but aren't me. This was part of a little project to generate more signature lines for my Usenet auto-sig script, so it was back when I was reading Usenet regularly – maybe 15-20 years ago.

        Just did another one on DDJ and I got through four dozen results before I found one that's me. (I skipped a few aggregator results such as genealogy sites.) And that's despite using my real name online since 1991 and being active in multiple professional fields, contributing to open-source software, etc.

        Google doesn't do much better unless you add keywords.

        Unless your name is quite unusual or you're particularly prominent in some field (my wife is in both categories and pops up immediately in searches), Sphere's likely to generate a mishmash of nonsense for your name, I'd guess.

  9. Anonymous Coward

    So many problems here

    Firstly, their statements about the quality and reliability of Wikipedia are not just laughable, they are full-on howlers. It does lead one to another key issue with both, that Jimbo Wales is no saint, and the unholy alliance between Google/Alphabet and Jimbo's great social experiment has resulted in a crisis where an inherently and intentionally unreliable website run by an opaque and unaccountable organization has been blessed as the default and near authoritative repository of all knowledge.

    Considering that more thought and effort goes into international shipping labels every year than into the process that allowed Jimbo's potato-headed love child to take over the internet, I can do little more than cringe at letting one of the few worse organizations on the net wedge itself into the same space. When Facebook's people are either so deluded as to think Wikipedia is a high quality source, or so unwilling to offend the Wikiswarm that they praise it anyway, I can't trust their integrity or their works. Certainly only fools would trust the company's intentions.

    Kick the fools at both orgs out of the room and then replace them with the better choice of neither.

  10. DS999 Silver badge

    Wikipedia may have its issues

    But at least it has people preventing a complete garbage in / garbage out "Microsoft Tay" type situation. How would something totally automated avoid stuff like a bunch of people posting "Donald Trump has orange skin because he has Oompa Loompa disease" and the AI, seeing that posted everywhere, believing that's the case and ending up stating that as a fact alongside actual facts?

    There will always be "moderator wars" on Wikipedia, especially about partisan topics (whether political, religious, nationalist, etc.), but at least someone editing a widely viewed page and inserting something both sides agree is false, regardless of motive, would be rejected. How is an "AI" going to do that?

    If it really was intelligent it could tell, but we are nowhere remotely close to achieving that. And if we manage it in my lifetime, I feel confident in saying it won't be Facebook that gets there first.

  11. Yes Me Silver badge


    universal, uncurated and unstructured knowledge source

    i.e. less accurate and more full of crap than Wikipedia

  12. Anonymous Coward

    Meta has open-sourced a machine-learning resource

    beware of Facebook bearing gifts...


    too late, pandora!

  13. Howard Sway Silver badge

    This is the big problem with "Big Data + ML"

    The evangelists and their followers all believe that the truth is whatever the majority of data points say it is. Even if that majority is obtained by a wild web scraping exercise with little control over how the AI makes its decisions. The reason being that this is believed to be somehow "democratic".

    In reality, it's like politicians who pretend the truth is whatever the majority believe it to be, no matter how stupid.

    Farewell to the scientific method, where implausible sounding and weird things like quantum theory do actually get proved to be the truth. The big data true believers have decreed that facts are now confirmed to be true by a flawed process of determining which beliefs are the most popular. Not by expert peer review disqualifying ideas that don't stack up.

  14. Tubz Silver badge

    All controlled by Meta, who we all know are honest, trustworthy, never lie, won't amend data to its own point of view and will not make any money from it!

  15. Smeagolberg

    Wikipedia - an encyclopaedia, but not as we know it, Jim

    I once worked at a university at which a lecturer used to seed a Wikipedia article with false information before setting an essay on the topic. He would use excerpts from essays quoting his Fake Facts to teach students the importance of skepticism and using reliable sources.

    Wikipedia has a lot of barely literate, jobsworth-type 'editors' who clearly aren't on top of their subject matter and are linguistically challenged, but don't need to be very competent to wield their power, so that's OK.

    There are armies of PR people using it for promoting their clients.

    Some subject areas have good information, but many don't.
