Data centre capacity is going to become very cheap once the bubble bursts.
What happens when we can’t just build bigger AI datacenters anymore?
Generative AI models have not only exploded in popularity over the past two years, they've also grown at a breakneck pace, necessitating ever larger quantities of accelerators to keep up. Short of a breakthrough in machine learning, and with power becoming a limiting factor, AI's continued growth may ultimately hinge on a …
COMMENTS
-
Friday 24th January 2025 19:17 GMT AndrewTR
Better algorithms?
When throwing more hardware at a problem becomes impractical, perhaps it's time to try inventing better algorithms, just as DeepSeek did this week with the open-source release of its R1 model. Its performance is on par with OpenAI's o1, and its training supposedly cost only $5.5 million (DeepSeek is based in China, so they are GPU-constrained). By the way, how come I haven't seen this story on The Register yet? Silicon Valley is freaking out about it.
-
Friday 24th January 2025 19:17 GMT diodesign
DeepSeek
If there's one thing we don't do, it's freak out about model releases. We do in fact have a piece coming out soon, not only about DeepSeek but also about how to run it locally to try it out.
Edit: It's now here :)
C.
-
Saturday 25th January 2025 01:17 GMT HuBo
Re: DeepSeek
It'll be neat to see some "hands-on" with DeepSeek indeed (and safety considerations wrt data exfiltration, if relevant ...).
But on the networked datacenter approach discussed here, I thought this was the whole idea behind yesterday's Stargate project, whereby Oracle was putting together tens of datacenters, and then tens more (all using Ampere Altras, or AmpereOnes, or whatnot, seeing how SoftBank is involved ... or maybe Graces), with the goal of linking them together to support training of OpenAI LLMs with trillions of parameters ... (for whatever purpose, ahem!)
... and then, following from AndrewTR here, and possibly martinusher there, it seems that folks in China (Telecom) might have used exactly that kind of datacenter-linking approach last September/October to train their trillion-parameter LLM (or maybe 100 billion) on somewhat "underperforming" locally made chips. An interesting subtext might be that if networking bottlenecks your compute anyway, then some lower-specced, older-generation CPUs and accelerators might end up sufficing in the leaf nodes ...
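A minimal sketch of that intuition, with made-up numbers and assuming the cross-site gradient exchange can't be overlapped with compute:

    # If the inter-datacenter link is the bottleneck, a faster leaf-node
    # accelerator doesn't shorten the training step at all.
    def step_time(compute_s, comm_s):
        # A step can't finish before both local compute and the
        # cross-site gradient exchange complete.
        return max(compute_s, comm_s)

    wan_comm = 2.0                    # illustrative seconds per gradient swap over the WAN
    print(step_time(0.5, wan_comm))   # newest accelerator: 2.0 s/step
    print(step_time(1.5, wan_comm))   # older, cheaper chip: still 2.0 s/step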
Then again, that could just be me somehow reading between lines of pixels, and then extraexfoliating a bit, in an antialiasing kind of way ... (or not?)!
-
Saturday 25th January 2025 01:52 GMT HuBo
Re: DeepSeek
BTW, I'm glad this here TFA focuses on the networking aspects of multi-datacenter computation (with expert POVs: Boujelben, Shainer, Wilson), as the White House press briefing on Stargate was quite short on tech details, except for Larry's optimism telling us they were gonna cure cancer with early AI detection and 48-hour robot-production of the corresponding anti-cancer vaccines! (so much the better if it works, but not holding my breath yet ...). Thanks for that piece!
-
Monday 27th January 2025 11:55 GMT Anonymous Coward
Re: DeepSeek
"cure cancer with early AI detection and 48-hour robot-production of the corresponding anti-cancer vaccines! (so much the better if it works, but not holding my breath yet ...)."
FYI:
'Holding your breath' is a known cure for cancer (in fact most 'everything' !!!) ... it just has a 'few' downsides that need sorting out ... much like 'AI' !!!
:)
-
Monday 27th January 2025 08:38 GMT Christoph
Re: Asimov had the answer . . .
But watch out for Fredric Brown's "Answer"
-
Saturday 25th January 2025 16:13 GMT Androgynous Cow Herd
Lots of hand waving around latency…but no answer.
Hollow-core fibre etc., and good job calling out some of the challenges to this approach. But I am very skeptical that the speed-of-light barrier will be overcome, since light is exactly what is travelling down those fibres. Transactional speed measured in milliseconds is way, way too slow for computation at this scale. Locally, sure, InfiniBand or RoCE for interconnects, but that is gonna stink on ice after just a few kilometers.
Mix high bandwidth demands with computational latency and usable solutions are measured in meters, not kilometers. The breakthrough that would solve that would have a far bigger impact than all the generative AI nonsense combined.
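A back-of-the-envelope check, assuming light in fibre propagates at roughly two-thirds of c (refractive index ~1.5, so about 200,000 km/s), and ignoring switching and serialisation overheads entirely:

    # Best-case propagation delay in optical fibre.
    C_FIBRE_KM_PER_S = 200_000  # ~c/1.5

    def round_trip_ms(distance_km):
        return 2 * distance_km / C_FIBRE_KM_PER_S * 1_000

    for km in (0.1, 10, 100, 1_000):
        print(f"{km:8.1f} km -> {round_trip_ms(km):7.3f} ms round trip")
    # 100 m costs about a microsecond; 1,000 km costs about 10 ms --
    # orders of magnitude slower than an in-rack InfiniBand hop.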
Reads more like an anal-list talking blue sky and unicorns.
-
Sunday 26th January 2025 00:25 GMT HuBo
Re: Lots of hand waving around latency…but no answer.
Then again, if one follows W.J. Wilkinson's lead (Journal of Scientific Exploration Vol. 37 No. 1 (2023) -- Anomalistics and Frontier Science), it might just be possible to apply conscious intent to influence a quantum-entangled link and so demonstrate superluminal communication:
"non-local, i.e., superluminal, communication of some sort is going on all around us and maybe not just at the quantum level"
Hmmmm ... either that or establish an Einstein–Rosen bridge between datacenters ... Event horizontally, with traversable wormholes and related Visser Roman rings, it should be possible to have distributed LLMs trained well before the datacenters are even built (iiuc)!
Investing in Physics education and research might well provide better bang for the buck (long-term) than AI at this juncture then!?
-
Sunday 26th January 2025 21:03 GMT Sparkus
Power *and* cooling
are the limiting factors.
The greed of local councils to fill unused industrial estates and 'idle' farmland will never be satisfied. Some other limiting factor, such as moral outrage over the unproductive consumption of energy resources and the adverse impact on the reliability of energy grids, will have to come into play ...
GPU constraints? There are some arguments that a 256-bit enhancement to AltiVec has the potential to outperform Nvidia silicon on far cheaper and less energy-hungry hardware. If that comes to fruition (Power 11/12, anyone?), all real-estate and gigawatt calculations are off.
And remember, at the end of The Forbin Project the Colossus / Guardian symbiote demanded that the island of Malta be evacuated and hollowed out as an AI fortress / core.
-
Monday 27th January 2025 07:55 GMT Jason Hindle
Those who do more with less will likely win in the longer term
The compute cost of doing anything with OpenAI's most advanced model is horrendous, yet others get close with far fewer resources. As anyone who squeezed a flight simulator into 1k will tell you, scarce resources can drive efficiency and innovation.
-
Monday 27th January 2025 13:33 GMT nemecystt
Emperor's new stilts
Anyone who knows enough about massively parallel computing, and about networks operating over large distances with their unavoidable long latencies, knows that this idea will not pan out very well. The scale-up in performance will be much less than the scale-up in hardware required.
The problem would be less severe with a totally feed-forward algorithm (which the training of NNs is not), but you still have the feedback inherent in the network communication protocol.
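An Amdahl-style sketch of why, with the 5 per cent serial/synchronisation fraction picked purely for illustration:

    # Amdahl's law: a fraction s of each step that can't be parallelised
    # caps the speedup at 1/s, no matter how many nodes you add.
    def speedup(n_nodes, serial_fraction):
        return 1 / (serial_fraction + (1 - serial_fraction) / n_nodes)

    s = 0.05  # assumed non-parallel (sync/comms) share of each training step
    for n in (8, 64, 512, 4096):
        print(f"{n:5d} nodes -> {speedup(n, s):5.1f}x speedup (ceiling {1/s:.0f}x)")
    # 512x the hardware buys roughly 19x the throughput, and inter-site
    # latency pushes s up, lowering the ceiling further.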
Many of these AIs (LLMs in particular) are reaching a bottleneck already, having exhausted the available training data anyhow.
But they'll still manage to fool the investors for a while.
-
Monday 27th January 2025 19:08 GMT dubious
Re: Emperor's new stilts
Yes, I was going to say that this doesn't seem realistic either.
Performance scaling isn't close to linear even in a fully meshed system, and most HPC systems are divided up into cells, pods, and racks that you already try to keep your job inside to minimise unacceptable latency. Few jobs are going to be suitable for distributed-DC compute, certainly not current AI training.
Even Frontier classifies a large job as one which takes up a quarter of the machine, iirc.
-