What happens when we can’t just build bigger AI datacenters anymore?

Generative AI models have not only exploded in popularity over the past two years, but they've also grown at a precipitous rate, necessitating ever larger quantities of accelerators to keep up. Short of a breakthrough in machine learning and with power becoming a limiting factor, AI's continued growth may ultimately hinge on a …

  1. Doctor Syntax Silver badge

    Data centre capacity is going to become very cheap once the bubble bursts.

    1. Anonymous Coward
      Anonymous Coward

      Doesn't matter once they have all the money available.

  2. prh99

The plateau can't come soon enough, because I can't wait for the AI hype to die and the bubble to pop. With any luck it will take some executives down with it.

    1. Anonymous Coward
      Anonymous Coward

      I wonder how much tasty kit will be available for how little when the AI bubble goes pop

  3. AndrewTR

    Better algorithms?

When throwing more hardware at a problem becomes impractical, perhaps it's time to try inventing better algorithms. Just as DeepSeek did this week with the open-source release of their R1 model. Its performance is on par with OpenAI's o1, and its training supposedly cost only $5.5 million (DeepSeek is based in China, so they are GPU-constrained). By the way, how come I haven't seen this story on The Register yet? Silicon Valley is freaking out about it.

    1. diodesign (Written by Reg staff) Silver badge

      DeepSeek

If there's one thing we don't do, it's freak out about model releases. We do in fact have a piece coming out soon, not only about DeepSeek but also how to run it locally to try it out.

      Edit: It's now here :)

      C.

      1. HuBo Silver badge
        Windows

        Re: DeepSeek

        It'll be neat to see some "hands-on" with DeepSeek indeed (and safety considerations wrt data exfiltration, if relevant ...).

But to the networked datacenter approach discussed here, I thought this was the whole idea behind yesterday's Stargate project, whereby Oracle was putting together tens of datacenters, and then tens more (all using Ampere Altras, or AmpereOnes, or whatnot, seeing how SoftBank is involved ... or maybe Graces), with a goal of linking them together to support training of OpenAI LLMs with trillions-plus parameters ... (for whatever purpose, ahem!)

... and then, following from AndrewTR here, and possibly martinusher there, it seems that folks in China (Telecom) might have used exactly that kind of datacenter-linking approach last September/October to train their trillion-parameter LLM (or maybe 100 billion) on somewhat "underperforming" locally-made chips. An interesting subtext might be that if networking bottlenecks your compute anyway, then some lower-specced, older-generation CPUs and accelerators might end up sufficing in the leaf nodes ...
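That subtext can be put into numbers. A minimal sketch, with all figures purely hypothetical (not measurements of any real chip or link): when the cross-datacenter sync time dominates the step time, a much slower leaf-node chip barely hurts overall throughput.

```python
# Illustrative only: assumed per-step compute, sync volume, and link speed.
def step_time_s(flops_needed: float, chip_flops: float,
                bytes_to_sync: float, link_bytes_per_s: float) -> float:
    """Per-step wall time when compute and gradient sync don't overlap."""
    compute = flops_needed / chip_flops
    network = bytes_to_sync / link_bytes_per_s
    return compute + network

FAST, SLOW = 1e15, 2.5e14            # hypothetical 1 PFLOPS vs 0.25 PFLOPS chips
WORK, SYNC, LINK = 1e15, 1e11, 1e10  # 1 PFLOP/step, 100 GB sync, 10 GB/s link

fast = step_time_s(WORK, FAST, SYNC, LINK)   # 1 s compute + 10 s network = 11 s
slow = step_time_s(WORK, SLOW, SYNC, LINK)   # 4 s compute + 10 s network = 14 s
print(f"fast chip: {fast:.0f} s/step, slow chip: {slow:.0f} s/step")
```

Under these made-up numbers, a chip with a quarter of the FLOPS costs only about 27 percent more wall time per step, since the link is the bottleneck either way.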

        Then again, that could just be me somehow reading between lines of pixels, and then extraexfoliating a bit, in an antialiasing kind of way ... (or not?)!

        1. HuBo Silver badge
          Thumb Up

          Re: DeepSeek

BTW, I'm glad for this here TFA focusing on the networking aspects of multi-datacenter computation (with expert POVs: Boujelben, Shainer, Wilson), as the White House press briefing on Stargate was quite short on tech details, except for Larry's optimism telling us they were gonna cure cancer with early AI detection and 48-hour robot production of the corresponding anti-cancer vaccines! (so much the better if it works, but not holding my breath yet ...). Thanks for that piece!

          1. Anonymous Coward
            Anonymous Coward

            Re: DeepSeek

            "cure cancer with early AI detection and 48-hour robot-production of the corresponding anti-cancer vaccines! (so much the better if it works, but not holding my breath yet ...)."

            FYI:

            'Holding your breath' is a known cure for cancer (in fact most 'everything' !!!) ... it just has a 'few' downsides that need sorting out ... much like 'AI' !!!

            :)

            1. O'Reg Inalsin Silver badge

              Re: DeepSeek

              How fortunate! That should just balance out the 6+ months extra wait for the cancer test insurance approval due to AI generated insurance claim denials.

          2. 0laf Silver badge

            Re: DeepSeek

Assuming it can generate the cure for cancer, do you think you can afford to buy it?

    2. klh

      Re: Better algorithms?

DeepSeek, like most of these models, is not open source.

      1. O'Reg Inalsin Silver badge

        Re: Better algorithms?

On that factual point, I think you are wrong. See the TechCrunch article "Former Intel CEO Pat Gelsinger is already using DeepSeek instead of OpenAI at his startup Gloo". The source AND the weights are already on Hugging Face.

    3. shokk

      Re: Better algorithms?

Nobody bothers saddling DeepSeek with ethical questions, like where they got their training data from?

  4. Omnipresent Silver badge

    bunch of mac minis

    That's what some research facilities do.

  5. Howard Sway Silver badge

    What happens when we can’t just build bigger AI datacenters anymore?

    If the AI is so good that all this money needs to be spent on it, just ask the AI to answer that question.

If it can't, then don't bother.

    1. Lazlo Woodbine Silver badge

      Re: What happens when we can’t just build bigger AI datacenters anymore?

      It came back with the answer "42"

      We just need a bigger AI to work out the question...

  6. Eecahmap

    Asimov had the answer . . .

    . . . in "The Last Question".

    1. Christoph

      Re: Asimov had the answer . . .

      But watch out for Fredric Brown's "Answer"

      1. Anonymous Coward
        Anonymous Coward

        Re: Asimov had the answer . . .

        Alternative url:

        https://rowrrbazzle.blogspot.com/2016/06/answer-by-fredric-brown-full-short.html

        1. Doctor Huh?

          Ellison might have the more applicable take on it

          https://en.wikipedia.org/wiki/I_Have_No_Mouth,_and_I_Must_Scream

      2. 0laf Silver badge
        Pint

        Re: Asimov had the answer . . .

I've been trying to remember the name of that short story for nearly 40 years. You, sir, are a hero.

    2. Andrew Scott Bronze badge

      Re: Asimov had the answer . . .

Better answer: Clarke's "The Nine Billion Names of God".

  7. Androgynous Cow Herd Silver badge

    Lots of hand waving around latency…but no answer.

Hollow-core fibre etc., and good job calling out some of the challenges to this approach. But I am very skeptical that the speed of light as a barrier will be overcome, since light is exactly what is traveling down those fibres. Transactional speed measured in milliseconds is way, way too slow for computation at this scale. Locally, sure, InfiniBand or RoCE for interconnects, but that is gonna stink on ice after just a few kilometers.

    Mix in high bandwidth with computational latency… usable solutions are measured in meters, not kilometers. The breakthrough that would solve that would have a far bigger impact than all the generative AI nonsense combined.

    Reads more like an anal-list talking blue sky and unicorns.

    1. HuBo Silver badge
      Alien

      Re: Lots of hand waving around latency…but no answer.

      Then again, if one follows W.J. Wilkinson's lead (Journal of Scientific Exploration Vol. 37 No. 1 (2023) -- Anomalistics and Frontier Science), it might just be possible to apply conscious intent to influence a quantum-entangled link and so demonstrate superluminal communication:

      "non-local, i.e., superluminal, communication of some sort is going on all around us and maybe not just at the quantum level"

      Hmmmm ... either that or establish an Einstein–Rosen bridge between datacenters ... Event horizontally, with traversable wormholes and related Visser Roman rings, it should be possible to have distributed LLMs trained well before the datacenters are even built (iiuc)!

      Investing in Physics education and research might well provide better bang for the buck (long-term) than AI at this juncture then!?

  8. Sparkus

    Power *and* cooling

    are the limiting factors.

    The greed of local councils to fill unused industrial estates and 'idle' farmland will never be satisfied. Some other limiting factor, such as moral outrage over the unproductive consumption of energy resources and adverse impact on the reliability of energy grids will have to come into play......

GPU constraints? There are some arguments that a 256-bit enhancement to AltiVec has the potential to outperform Nvidia silicon on far cheaper and less energy-hungry hardware. If that comes to fruition (Power 11/12 anyone?), all real-estate and GWatt calculations are off.

    And remember, at the end of The Forbin Project the Colossus / Guardian symbiote demanded that the island of Malta be evacuated and hollowed out as an AI fortress / core.

    1. Androgynous Cow Herd Silver badge

      Re: Power *and* cooling

Cooling *is* power from a density standpoint.

  9. Paul Hovnanian Silver badge
    Boffin

    A Beowulf Cluster ...

    ... of data centers.

  10. Anonymous Coward
    Anonymous Coward

    Haven't you watched the Matrix?

Computers and machines cover the planet while humans all lie in stasis, dreaming.

    1. blu3b3rry Silver badge

      So that's where Uncle Zuck got the idea of his "Metaverse" from.

  11. Jason Hindle Silver badge

    Those who do more with less will likely win in the longer term

    The compute cost of doing anything with OpenAI's most advanced model is horrendous, yet others get close with far fewer resources. As anyone who squeezed a flight simulator into 1k will tell you, scarce resources can drive efficiency and innovation.

  12. nemecystt

    Emperor's new stilts

    Anyone who knows enough about massively parallel computing and networks operating over large distances with the unavoidable long latencies knows that this idea will not pan out very well. The scale-up in performance will be much less than the scale up in hardware required.

    The problem would be less bad with a totally feed-forward algorithm (which the training of NNs is not), but you still have the feedback inherent in the network communication protocol.
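The "scale-up in performance is much less than the scale-up in hardware" point is Amdahl's law in action. A minimal sketch, with the serial fraction chosen purely for illustration: if even a small slice of each training step is serialised on cross-site synchronisation, adding hardware yields rapidly diminishing returns.

```python
def amdahl_speedup(serial_fraction: float, n: int) -> float:
    """Ideal speedup on n workers when serial_fraction of the job can't parallelise."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n)

# Assume (illustratively) that 5% of each step is serialised on
# inter-datacenter communication: 100x the hardware buys under 17x
# the throughput, and the curve flattens toward 1/0.05 = 20x forever.
for n in (10, 100, 1000):
    print(f"{n:>5} workers -> {amdahl_speedup(0.05, n):6.2f}x speedup")
```

And since long-haul latency pushes that serial fraction up rather than down, distributing training across datacenters makes the ceiling lower, not higher.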

    Many of these AIs (LLMs in particular) are reaching a bottleneck already, having exhausted the available training data anyhow.

    But they'll still manage to fool the investors for a while.

    1. dubious

      Re: Emperor's new stilts

Yes, I was going to say that this doesn't seem realistic either.

Performance scaling isn't close to linear even in a fully meshed system, and most HPCs are divided up into cells, pods, and racks that you already try to keep your job inside to minimise unacceptable latency. Few jobs are going to be suitable for distributed-DC compute, certainly not current AI training.

Even Frontier classifies a large job as one which takes up 1/4 of the machine, iirc.

  13. Anonymous Coward
    Anonymous Coward

    What service do you sell and who do you sell them to once AI has replaced all workers?

    I'm not sure even billionaires need an enterprise grade AI to count their money or lock the doors on their bunkers.

  14. shokk

    Go up

Easy. Just as food farming in Nordic countries has gone vertical, bit-farming datacenters will go vertical.
