50GWh? That’s all? For the whole model?
I hate to hand-wave too much, but a training run consuming “the equivalent power use of 4,500 homes” while running seems like a pretty good bargain to me.
By comparison, there are around 145,000,000 homes in the US, and in 2023 the US consumed, per the EIA [0], 11.3 quadrillion BTU (~3,300 TWh) for residential uses (all residential energy sources, not just electricity). So over the 100 days the model trained, let’s eyeball the consumption of those homes at a little under 1,000 TWh. Or 1,000,000 GWh.
The 50 GWh consumed in training puts us more than four orders of magnitude under the residential consumption of that country over the same period, which itself is only about 15% of the total energy consumed in the overall economy.
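Spelling the arithmetic out, since it’s easy to drop a zero in these comparisons (a quick Python sketch using the rounded figures above; the inputs are my eyeballed numbers, not precise EIA data):

    import math

    # US residential energy use, all sources, 2023 (rounded from ~11.3 quadrillion BTU)
    residential_twh_per_year = 3_300
    training_days = 100

    # Residential consumption over the ~100-day training window, in GWh
    residential_gwh = residential_twh_per_year * (training_days / 365) * 1_000
    print(round(residential_gwh))                       # ~900,000 GWh, "a little under 1,000,000"

    training_gwh = 50
    print(math.log10(residential_gwh / training_gwh))   # ~4.3, i.e. over four orders of magnitude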
This is the training run: the energy-intensive “building the factory” part of delivering the product. For a back-of-envelope first estimate, the Empire State Building used 57,000 tons of steel [1], and per a random but plausible-sounding web search result, producing building steel today takes something like 7,500 kWh/ton. Suggesting the energy that went into training this OpenAI model is somewhere in the neighborhood of 10% of what went into the Empire State Building’s steel (ignoring the energy cost of the concrete!).
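Same kind of sketch for the steel comparison; the 7,500 kWh/ton figure is just the hedged web-search number above, not an authoritative one:

    # Energy embodied in the Empire State Building's steel vs. the 50 GWh training run
    esb_steel_tons = 57_000
    kwh_per_ton = 7_500                                  # rough modern figure for producing a ton of steel
    esb_steel_gwh = esb_steel_tons * kwh_per_ton / 1e6   # kWh -> GWh
    print(round(esb_steel_gwh))                          # ~430 GWh
    print(round(50 / esb_steel_gwh, 2))                  # ~0.12, i.e. training used roughly 10% of that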
The Empire State Building has 2.8 million sq ft, so even cramming people in at 100 sq ft each, it’s serving on the order of 28,000 people’s office needs. OpenAI say the model they built with this energy serves 200,000,000 people every week [2]. We’re firmly in apples-to-oranges territory, but still: in terms of humans served, that’s roughly four orders of magnitude more than a building whose steel alone cost 10x as much energy.
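And the humans-served comparison, with the apples-to-oranges caveat still very much in force:

    import math

    # People an Empire State Building's worth of office space serves, generously packed
    esb_occupants = 2_800_000 / 100          # 2.8M sq ft at 100 sq ft per person = 28,000
    weekly_users = 200_000_000               # OpenAI's reported weekly active users [2]

    print(round(weekly_users / esb_occupants))           # ~7,100x more people
    print(math.log10(weekly_users / esb_occupants))      # ~3.9, call it four orders of magnitude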
Longer run, sure, inference is more energy-intensive than status-quo data center activities: it’ll always cost more than slinging JavaScript over a wire. And the big operators are already making a splash by tendering contracts for dedicated generation capacity co-located with their bit barns, tending to prefer nuclear supply for the purpose (however you feel about nuclear, it’s not carbon-intensive).
But OpenAI are already serving their 200,000,000 weekly active users with extant capacity; the same RISE article estimates the total energy cost of serving GPT-4 to that whole user base for a year at 91 GWh. So with our remaining ~90%-of-an-Empire-State-Building’s-steel energy budget, we can keep serving those 2×10^8 humans for, what, another four or five years? I imagine that works out to wildly less energy than asking all those people to schlep to a library to get their questions answered.
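Running that budget out (91 GWh/year is the RISE estimate quoted above; the steel energy is the same rough figure as before):

    # Years of inference the leftover "90% of an ESB's steel" energy budget buys
    esb_steel_gwh = 427.5                        # from the sketch above
    remaining_budget_gwh = esb_steel_gwh - 50    # the ~90% left after the 50 GWh training run
    inference_gwh_per_year = 91                  # RISE estimate: whole user base, one year
    print(remaining_budget_gwh / inference_gwh_per_year)   # ~4.1, so four-ish more years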
There are plenty of thorny problems with the AI stuff. I just don’t understand why the carbon-intensity angle is so salient for critics: these models pose hazards along so many lines, some existential—this just doesn’t feel like a significant one (and it does seem like an especially solvable one).
Refs (I don’t have link permission):
[0] https://www.eia.gov/energyexplained/us-energy-facts/
[1] https://ascemetsection.org/committees/history-and-heritage/landmarks/empire-state-building
[2] https://www.axios.com/2024/08/29/openai-chatgpt-200-million-weekly-active-users