Supercomputer to train 176-billion-parameter open-source AI language model

BigScience – a team made up of roughly a thousand developers around the world – has started training its 176-billion-parameter open-source AI language model in a bid to advance research into natural language processing (NLP). The transformer architecture makes it possible to train such large neural networks far more efficiently. Powered …
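
(For the curious: the "attention" operation at the heart of the transformer architecture is small enough to sketch in a few lines. The snippet below is a minimal, illustrative NumPy version of single-head self-attention – not BigScience's actual training code, and the dimensions are invented for the example.)

    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        """Each output row is a weighted mix of the value rows in V,
        with weights derived from how well the query matches each key."""
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)                  # (seq, seq) similarities
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
        return weights @ V                               # (seq, d_v) mixed values

    # Toy example: a 4-token sequence with 8-dimensional embeddings.
    rng = np.random.default_rng(0)
    x = rng.standard_normal((4, 8))
    print(scaled_dot_product_attention(x, x, x).shape)   # (4, 8)

Because every token attends to every other token via plain matrix multiplications, whole sequences can be processed in parallel – a large part of why networks of this size can be trained at all.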

  1. Pascal Monett Silver badge

    A 176-billion-parameter NLP

    I'm sure SAP could make a mint selling armies of consultants to configure that beast.

    Language is complicated, for sure, but how long does it take to configure 176 billion parameters?

    It seems to me it takes way too long to be useful. Tell me there is some form of automation here.

    1. Mike 137 Silver badge

      Re: A 176-billion-parameter NLP

      With such a vast number of parameters, finding optimum values may become effectively impossible - chaos could easily set in. Quite apart from which, considering how simply language appears to operate in the bio-space (despite all its ambiguities), there's a suspicion that this level of parametric complexity suggests the overall model may be wrong. Maybe brute force is not the way language works.

      1. Primus Secundus Tertius

        Re: A 176-billion-parameter NLP

        I once constructed a list of the words I use, to support a spellcheck program. I was using about 5,000 root words; with plurals, verb forms, etc., that became about 20,000 words.

        Select 5 words from 20,000 and that is an enormous number of short sentences, most of which are impossible. The fun starts if you want to guess what the possible ones might mean, for purposes such as translation, indexing and filing, or customer service.
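
        To put a rough number on it (a back-of-the-envelope sketch, assuming order matters and words may repeat):

            # Five-word strings drawn from a 20,000-word vocabulary,
            # order significant, repetition allowed.
            vocab_size = 20_000
            sentence_length = 5
            print(f"{vocab_size ** sentence_length:.3e}")  # ~3.200e+21 strings

        That is roughly 3 x 10^21 candidate five-word strings, only a vanishing fraction of which are grammatical – which is why enumeration is hopeless and statistical models are used instead.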

        I suspect the current brute force approach is running out of steam. For example, machine translations are less worse than they were, but still not up to a standard anyone would pay for.

        1. Mike 137 Silver badge

          Re: A 176-billion-parameter NLP

          "machine translations are less worse than they were, but still not up to a standard anyone would pay for"

          We've been working for at least half a century towards a quality someone would feel happy paying for, and we're not there yet. The missing component, as always, is understanding. That's what unscrambles the ambiguities of natural language, not only according to immediate context but particularly on the basis of whether it makes sense - something we can all usually decide, though nobody really understands the mechanism by which we do it. It may well be at least partly based on years of physical exposure to language in real-world situations - very different from relying on text parsing.

      2. LionelB Silver badge

        Re: A 176-billion-parameter NLP

        "There's a suspicion that this level of parametric complexity suggests the overall model may be wrong. Maybe brute force is not the way language works."

        Doesn't the famous George Box aphorism, "All models are wrong, but some models are useful", apply here, though?

        Brute force is almost certainly not the way language works -- depending on your working definition of "brute force" (my understanding of these kinds of models is that they exploit limited context, and possibly limited semantics too, though I'm not sure) -- but if it works for some value of "works" that suffices for you, then it's job done.
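
        For a concrete sense of "exploits limited context": the crudest such model is an n-gram predictor, which conditions on only the last few words. A toy bigram version, purely illustrative:

            from collections import Counter, defaultdict

            def train_bigrams(text):
                """Count word pairs: the context is exactly one previous word."""
                words = text.lower().split()
                model = defaultdict(Counter)
                for prev, nxt in zip(words, words[1:]):
                    model[prev][nxt] += 1
                return model

            def predict(model, prev, k=3):
                """Most likely next words, given only the previous word."""
                return [w for w, _ in model[prev].most_common(k)]

            model = train_bigrams("the cat sat on the mat and the dog sat on the rug")
            print(predict(model, "the"))  # e.g. ['cat', 'mat', 'dog']

        Transformer models are far more sophisticated – the context is the whole preceding sequence, weighted by learned attention – but the basic move of predicting the next token from bounded context is the same.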

        Which is not to say it does/will work. I do a fair amount of optimisation of complex problems, and 176 billion parameters, notwithstanding the size of the training set and concurrent computing resources, sounds frankly nuts.

  2. Anonymous Coward

    The Long Arm of

    M$.

  3. Neil Barnes Silver badge

    I have to admit I am curious about MS's AI

    I am currently (and have been for some time) trying to learn another language using Duolingo.

    My mobile phone uses SwiftKey (which I think is an MS product) and it's remarkably good at providing 'next word' suggestions in the foreign language. So I wonder: is it learning locally from me, or is it chattering with other people doing the same language course?

    Or, of course, some other mechanism of which I am unaware. Anyone know?

    1. Charlie Clark Silver badge

      Re: I have to admit I am curious about MS's AI

      MS bought SwiftKey. It will be using a publicly trained model, personalised to make better suggestions for you, but it will also send anonymised hit/miss data etc. back to the mothership.
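
      Speculatively, that blend might look something like the sketch below – a shipped global model interpolated with locally learned counts. The class, parameters, and weighting here are assumptions for illustration, not SwiftKey's actual design.

          from collections import Counter

          class BlendedSuggester:
              """Hypothetical: mix a pretrained global next-word model
              with counts learned locally from what the user types."""
              def __init__(self, global_probs, alpha=0.7):
                  self.global_probs = global_probs  # {prev_word: {next_word: prob}}
                  self.local_counts = {}            # {prev_word: Counter}
                  self.alpha = alpha                # weight on the global model

              def observe(self, prev, chosen):
                  self.local_counts.setdefault(prev, Counter())[chosen] += 1

              def suggest(self, prev, k=3):
                  g = self.global_probs.get(prev, {})
                  c = self.local_counts.get(prev, Counter())
                  total = sum(c.values()) or 1
                  candidates = set(g) | set(c)
                  def score(w):
                      return self.alpha * g.get(w, 0.0) + (1 - self.alpha) * c[w] / total
                  return sorted(candidates, key=score, reverse=True)[:k]

          s = BlendedSuggester({"guten": {"tag": 0.6, "morgen": 0.4}})
          s.observe("guten", "abend")  # the user keeps choosing "abend"
          print(s.suggest("guten"))    # local habit now competes with global stats

      The anonymised hit/miss data would then be a record of which suggestion, if any, the user accepted, fed back to improve the shared global model.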

  4. Doctor Syntax Silver badge

    Has nobody yet worked out that a human infant acquires natural language with a lot less fuss?

    1. Neil Barnes Silver badge

      More fun making them, too...

      1. Doctor Syntax Silver badge

        It's looking after them afterwards that's hard work. I reckon the first 40 years are the worst.

        1. The Dark Side Of The Mind (TDSOTM)

          Re: I reckon the first 40 years are the worst.

          Only the years from the 5th to the 20th are to be feared. The first 5 are giggles and obsessive care, and after 20 they usually become more or less independent... and less of a nuisance. Also, everything that you do (or don't do) in the 5-20 interval will be used against you at some point, probably more than once (if they survive the first time).

          I'm a terrible parent, probably. Time will tell. The same goes for all training models (time and telling), albeit seemingly a bit faster.
