China trains 100-billion-parameter AI model on home grown infrastructure

China Telecom's AI Research Institute claims it trained a 100-billion-parameter model using only domestically produced computing power – a feat that suggests Middle Kingdom entities aren't colossally perturbed by sanctions that stifle exports of Western tech to the country. The model is called TeleChat2-115B and, according to a …

  1. O'Reg Inalsin

    Bonsai tree

    It's quite possible that China, forced to grow with constrained computing ability, will develop more efficient systems, sooner.

    I doubt that OpenAI, just because it gets the most money, will have - proportionally - the best long term results.

    1. GenericLeftieWhackjob

      Re: Bonsai tree

      Same thing with the protectionist hysteria over Chinese cars in the US. Regulations demanding total disconnection and other "pile-ons" to protect Ford et al. may end up recreating the Toyota/Honda massacre of the '80s, as they're forced to build a good car that just happens to be electric, instead of these hype-beast shonky "EVs" that spy on you and make you watch ads in-cabin.

      1. steviebuk Silver badge

        Re: Bonsai tree

        The issue with Chinese EVs is mainly their batteries randomly detonating. Just look it up; there are plenty of examples.

        1. IGotOut Silver badge

          Re: Bonsai tree

          Is that because there are more of them than anywhere else?

          To put it a bit more in balance:

          https://www.honestjohn.co.uk/the-latest-car-fire-statistics/

    2. Anonymous Coward
      Anonymous Coward

      Re: Bonsai tree

      The Soviets got VERY good at optimisation of mathematics to run on limited computing hardware for the very reasons you've specified.

      There is software in widespread use in the oil and gas sector, based on very efficient algorithms devised behind the Iron Curtain. Admittedly, it usually comes with hundreds of MBs of user interface wrapped over the top of the very tight engine underneath.

  2. druck Silver badge
    Happy

    What can't it say

    Forget about the training, just think of how much computing power will be needed to ensure it doesn't mention the 4th of June or any A.A. Milne characters.

    1. munnoch Bronze badge

      Re: What can't it say

      "trained using 10 trillion tokens of high-quality Chinese and English corpus"

      The high quality Chinese corpus is easy -- "We love President Xi" 10 trillion times over.

      But, high quality English corpus???? If only there was one...

      1. Anonymous Coward
        Anonymous Coward

        Re: What can't it say

        Indeed not. We shall remain safely in the high-quality lead IMHO until such time that homegrown Chinese AI can convincingly replicate the hypnotic prowess of Gemini's most insightful poop fart poop fart poop fart poop Podcast ... (high tech at its best!)

        1. GenericLeftieWhackjob

          Re: What can't it say

          We cannot allow a mineshaft gap!

          ...Just ignore that nobody can fuckin read well enouf to turn on the cumpuuters because we've let morons overrun the education and training sector...

          1. PhilipN Silver badge

            Re: What can't it say

            You beat me to it.

            "high quality English corpus" - chuckle! Try talking to almost anyone in the UK and subtitles if not always essential would be a great help. And don't even get me started on Written English.

  3. Anonymous Coward
    Anonymous Coward

    Is 10 trillion tokens good?

    How does a 10-trillion-token training dataset compare to what Llama and o1 were trained on?

    1. Anonymous Coward
      Anonymous Coward

      Re: Is 10 trillion tokens good?

      A Massive Leap Forward: Comparing 10 Trillion Tokens to Llama

      10 trillion tokens is a staggering amount of training data, significantly larger than what models like Llama were trained on.

      • Llama: the largest of the original models was trained on a dataset of roughly 1.4 trillion tokens.

      This means a 10-trillion-token dataset is approximately 7 times larger than Llama's. Such a massive increase could potentially lead to significant improvements in a model's capabilities, including:

      • Enhanced language understanding: exposure to a wider variety of linguistic patterns.

      • Improved text generation: ability to produce more coherent and informative text.

      • Better performance on various tasks: such as translation, summarization, and question answering.

      Specific details about Llama:

      • Model size: varies by version; the original family ranged from 7 billion to 65 billion parameters.

      • Architecture: based on the Transformer architecture, a popular choice for language models.

      • Training data: primarily sourced from publicly available data such as CommonCrawl, Wikipedia, GitHub, and books.
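
      For anyone wanting to sanity-check the comparison above, the arithmetic is trivial; the 1.4 trillion figure is the one reported for the largest of the original Llama models, and both numbers are the vendors' own claims:

      ```python
      # Rough ratio of the claimed TeleChat2 training set to Llama's reported one.
      telechat_tokens = 10e12   # 10 trillion tokens, as claimed for TeleChat2-115B
      llama_tokens = 1.4e12     # 1.4 trillion tokens, reported for the original 65B Llama

      print(f"ratio: {telechat_tokens / llama_tokens:.1f}x")   # ratio: 7.1x
      ```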

  4. Anonymous Coward
    Anonymous Coward

    tofu telecom

    If this isn't more of the tofu-dreg-style shoddiness that plagues them, then maybe this is real; maybe not.

    Yet China Telecom are the sleeping dragon of the East. Once they eclipsed Vodafone, it was just a matter of time before they started flexing muscle outside of China.

    Vodafone, are you listening? You lost the mobile internet war to Apple. But now you have something they don't - the network. So do what China Telecom have attempted: use real-time streams of mobile users' data to train a quasi-live LLM.

    It is a complete business moat, and Voda, there is no way you can mess this up.

    1. martinusher Silver badge

      Re: tofu telecom

      I remember reading something recently about how the Chinese had come up with a distributed architecture where all the computing elements didn't need to be tightly coupled. This is a logical development and implies that there's been some work on the quality of information exchange between nodes (here all I have to go on is some very old knowledge of perceptrons / neural networks so I'm probably totally out of my depth).
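
      If it helps picture it, the loosely coupled setups people describe usually amount to something like "local SGD": each node trains on its own data shard for a while, and the nodes only occasionally exchange and average their parameters, so the links between them carry far less traffic than lock-step training would need. A toy sketch under that assumption, in plain NumPy; nothing here is what TeleChat actually runs:

      ```python
      # Toy "local SGD" illustration: workers run independent gradient steps and
      # only synchronise (by averaging parameters) every `sync_every` steps, so
      # nodes need far less interconnect bandwidth than lock-step training.
      import numpy as np

      rng = np.random.default_rng(0)

      # Synthetic linear-regression data, split across 4 workers.
      true_w = np.array([2.0, -3.0, 0.5])
      X = rng.normal(size=(4000, 3))
      y = X @ true_w + 0.01 * rng.normal(size=4000)
      shards = np.array_split(np.arange(4000), 4)

      n_workers, lr, sync_every, total_steps = 4, 0.05, 50, 500
      weights = [np.zeros(3) for _ in range(n_workers)]

      for step in range(1, total_steps + 1):
          for k in range(n_workers):
              # Each worker samples a mini-batch from its own shard only.
              idx = rng.choice(shards[k], size=32)
              grad = X[idx].T @ (X[idx] @ weights[k] - y[idx]) / len(idx)
              weights[k] -= lr * grad              # purely local update
          if step % sync_every == 0:
              # Occasional synchronisation: average parameters across workers.
              avg = np.mean(weights, axis=0)
              weights = [avg.copy() for _ in range(n_workers)]

      print("recovered weights:", np.round(np.mean(weights, axis=0), 3))
      # -> close to [ 2.  -3.   0.5]
      ```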

      But it all boils down to the idea that a) there's a hell of a lot of them and b) some of them are really clever. So this idea that we can somehow keep them subservient is laughable; it just isn't going to happen, and wasting effort on this rather than getting stuck in and competing peer to peer is just wasting valuable time and resources. Our hamfisted approach did them a favor by giving them goals to focus on.

      1. botfap

        Re: tofu telecom

        The Chinese have been focused on asynchronous training for AI from the start. The West has so far been running a purely synchronous model. The West's approach works better at small scale but fails to scale outside of a single data centre cluster.

        Western companies are starting to wake up to that mistake, however, and Google is currently leading the curve when it comes to asynchronous training. OpenAI has plans to explore asynchronous training too, but they are all behind the Chinese in terms of asynchronous experience at the minute.
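
        For the curious, the distinction roughly cashes out like this: a synchronous step waits for every worker's gradient before updating (an all-reduce per step, so one slow or distant node stalls everyone), while an asynchronous worker applies its gradient as soon as it is ready, accepting that it was computed against slightly stale parameters. A toy sketch of the two loops, purely illustrative and not anyone's actual training stack:

        ```python
        # Toy contrast between synchronous and asynchronous SGD updates.
        import numpy as np

        rng = np.random.default_rng(1)
        true_w = np.array([1.0, -2.0])
        X = rng.normal(size=(2000, 2))
        y = X @ true_w + 0.01 * rng.normal(size=2000)

        def grad(w, idx):
            return X[idx].T @ (X[idx] @ w - y[idx]) / len(idx)

        n_workers, lr, steps = 4, 0.1, 300

        # Synchronous (all-reduce style): one global step per round, and every
        # worker's gradient must arrive before the step can be taken.
        w_sync = np.zeros(2)
        for _ in range(steps):
            grads = [grad(w_sync, rng.choice(2000, 32)) for _ in range(n_workers)]
            w_sync -= lr * np.mean(grads, axis=0)

        # Asynchronous (parameter-server style): each worker pushes its gradient
        # as soon as it is ready, computed against the parameters it last saw.
        w_async = np.zeros(2)
        seen = [w_async.copy() for _ in range(n_workers)]   # possibly stale views
        for t in range(steps * n_workers):                  # same gradient budget
            k = t % n_workers
            g = grad(seen[k], rng.choice(2000, 32))         # gradient on stale params
            w_async -= lr * g                               # applied with no barrier
            seen[k] = w_async.copy()                        # worker refreshes its view

        print("sync :", np.round(w_sync, 3))   # both end up near [ 1. -2.]
        print("async:", np.round(w_async, 3))
        ```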

        Interesting article on asynchronous training ambitions of the western AI companies:

        Multi-Datacenter Training: OpenAI's Ambitious Plan To Beat Google's Infrastructure

        https://www.semianalysis.com/p/multi-datacenter-training-openais
