DeepMind working on distributed training of large AI models

Is distributed training the future of AI? As the shock of the DeepSeek release fades, its legacy may be an awareness that alternative approaches to model training are worth exploring, and DeepMind researchers say they've come up with a way of making distributed training much more efficient. DeepSeek caused an element of panic …

  1. Fonant

    Still Bullshit

    More efficient bullshit generation? Excellent news!! </sarcasm>

    1. Catkin Silver badge

      Re: Still Bullshit

      Depending on whether you view the CCP as virtuous or cretinous, there's correspondingly less or more bullshit from it on topics that reflect poorly on said government.

    2. Mentat74
      Mushroom

      Re: Still Bullshit

      You mean: more efficient stealing of everyone's copyrighted data...

  2. Howard Sway Silver badge

    they've come up with a way of making distributed training much more efficient

    Yeh - get other people to pay the electricity bill by parcelling out the work to anybody who logs into Google on their PC or phone and having it run there. More Deep Mined than Deep Mind.

  3. Tron Silver badge

    Distributed systems are the future.

    But AI is not the best use of them. Distributed social media is a much better option than the traditional, centralised approach.

  4. Pascal Monett Silver badge
    Stop

    "distributed training"

    So, if I understand correctly, spreading personally identifiable information over an even greater number of privately-owned, billionaire-funded companies is the solution to our future?

    Count me out.

    1. LionelB Silver badge

      Re: "distributed training"

      Unfortunately, it is becoming increasingly difficult to count oneself out.

  5. HuBo Silver badge
    Terminator

    Smart kids on the pâté de maisons

    Cool to see those French kids (e.g. Arthur Douillard -- DiPaCo, DiLoCo, and Louis Fournier -- WASH: comms-efficient weight shuffling & averaging) doing great work on distributed computations (hopefully applicable to actually useful computation some day too, like FP64 HPC for CFD, Maxwell eqs, ...!).

    Can't help noticing also that the next item in the TFA-linked "Import AI newsletter" is AI self-replication, whereby an agentic-enhanced LLM, prompted to "replicate yourself", might just do so "with no human interference" until full completion of the RotM.

    Wonder if that works with DeepSeek ... (the root incarnate of all this evil?)
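
For the curious, the gist of the DiLoCo/WASH line of work HuBo mentions is that each worker trains on its own data for many steps and the workers only occasionally synchronise, rather than exchanging gradients on every step. A toy numpy sketch of that local-steps-plus-averaging pattern - not the published algorithms; the model, shard sizes and sync schedule are made up purely for illustration:

    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic linear-regression problem split across 4 "workers"
    true_w = rng.normal(size=8)
    X = rng.normal(size=(4, 256, 8))                  # (workers, samples, features)
    y = X @ true_w + 0.01 * rng.normal(size=(4, 256))

    w = np.zeros(8)                # shared starting point
    H, LR, ROUNDS = 50, 0.01, 20   # local steps per sync, step size, sync rounds

    for _ in range(ROUNDS):
        local = []
        for k in range(4):                        # each worker trains on its shard...
            wk = w.copy()
            for _ in range(H):                    # ...for H cheap local steps
                grad = X[k].T @ (X[k] @ wk - y[k]) / len(y[k])
                wk -= LR * grad
            local.append(wk)
        w = np.mean(local, axis=0)                # the only communication per round

    print("distance to true weights:", np.linalg.norm(w - true_w))

The point of the exercise: communication happens once per ROUNDS iteration instead of once per gradient step, which is what makes the approach attractive for training spread across slow links.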

  6. harrys Bronze badge

    reality hits home when u run it locally

    told it to write a bash script to email out when resources hit a particular value

    it did it fine, but....

    looked at the cpu usage history and felt the extra heat generated in the room

    my brain could have done it for a few watts, or found the script somewhere online

    never again will i run a local model, this stuff is replicated millions of times right now in countless racks in data centres around the world

    it's even worse than the shite that is called crypto mining
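
harrys doesn't post the script, but the task described - mail out when resource usage crosses a threshold - is a few lines of Python with psutil and smtplib. The threshold, addresses and SMTP host below are placeholders, not anything from the comment:

    import smtplib
    import time
    from email.message import EmailMessage

    import psutil  # third-party: pip install psutil

    THRESHOLD = 90.0         # percent; placeholder value
    CHECK_EVERY = 60         # seconds between checks
    SMTP_HOST = "localhost"  # assumes a local MTA listening on port 25
    FROM_ADDR, TO_ADDR = "monitor@example.com", "admin@example.com"

    def send_alert(subject: str, body: str) -> None:
        msg = EmailMessage()
        msg["Subject"], msg["From"], msg["To"] = subject, FROM_ADDR, TO_ADDR
        msg.set_content(body)
        with smtplib.SMTP(SMTP_HOST) as smtp:
            smtp.send_message(msg)

    while True:
        cpu = psutil.cpu_percent(interval=1)    # percent, sampled over 1 s
        mem = psutil.virtual_memory().percent
        if cpu > THRESHOLD or mem > THRESHOLD:
            send_alert("Resource alert",
                       f"CPU {cpu:.0f}%, memory {mem:.0f}% (threshold {THRESHOLD:.0f}%)")
        time.sleep(CHECK_EVERY)

(A cron job or systemd timer could replace the while loop, but the loop keeps the sketch self-contained.)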

  7. Andrew Scott Bronze badge

    most of the LLMs i've played with stay somewhere in the ballpark when you have a "conversation" or ask a question, but my limited experience with deepseek-r1 is that its responses are often not on the same planet. i believe it's the worst model out there. its responses aren't consistent within a single paragraph of output. llama3.2 does a better job and it's half the size. Really can't understand why nvidia investors were so worried about deepseek.
