
In character
Literally everything Google has ever done has hinged on brute force in one way or another.
The latest iteration of Google’s custom-designed number-crunching chip, version three of its Tensor Processing Unit (TPU), will dramatically cut the time needed to train machine learning systems, the Chocolate Factory has claimed. Google CEO Sundar Pichai revealed the third version of the Google-crafted matrix math processor …
Yes, that's their trick, and it's a good one: find problem areas that are amenable to brute force and scale, then throw brute force and scale at them. It's the 'finding problem areas' bit that's hard.
Amazon did the same thing. I used to work for a dot-com which sold things that were typically customised per person. We used to sneer at Amazon for having picked the easy problem of selling books, which are all the same, while we had to spend a lot of time on clever tricks to deal with products that were all different. At some point I had a moment of insight: picking things to sell that were all the same was why Amazon were succeeding and we weren't. Their trick of picking the easy problem to solve, while we struggled with the hard one, meant they were smarter than us.
Strongly disagree. Look at Google's text to speech: they are running a neural network that generates audio at 16,000 samples a second, and still offering the service at a competitive price. That is not happening on an FPGA.
The old method takes minimal computational work compared with this one, yet Google can offer it at a competitive price because they have significantly lowered the joules per inference relative to the competition.
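For a sense of scale, here is a back-of-the-envelope sketch; the per-sample cost below is a number I made up for illustration, not anything Google has published:

```python
# Why sample-by-sample neural TTS is so much heavier than the old
# concatenative approach. All figures are illustrative assumptions.

SAMPLE_RATE = 16_000       # WaveNet-style models emit audio at 16 kHz
FLOPS_PER_SAMPLE = 100e6   # assumed cost of one autoregressive step

neural_flops = SAMPLE_RATE * FLOPS_PER_SAMPLE
print(f"Neural TTS: {neural_flops:.1e} FLOPs per second of audio")
# -> 1.6e+12, i.e. ~1.6 TFLOPs for every second of speech generated

# A concatenative system mostly copies pre-recorded units around:
concat_flops = 1e6         # assumed; effectively negligible
print(f"Ratio: {neural_flops / concat_flops:.0e}x more compute")
```

Whatever the real constants are, the autoregressive structure means the whole network runs 16,000 times per second of output, which is why joules per inference dominates the economics.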
So the feared impending A.I. war has been downgraded to a large near-future advertising misrepresentation brouhaha.
The A.S.A. will be at the forefront, fighting an underground guerrilla war for mankind's survival?
Too early in the morning; images from Terminator movies blending surreally with dull court dramas...
Is that picture in the article for real? They bring in cold liquid, heat it up on one TPU, pipe it to the next TPU, heat it some more, send it back to the cooling unit?
So presumably TPU #1 'mysteriously' always runs overly hot and has to throttle back, while TPU #0 gets to blithely crank along at full speed?
The chips do not have to run at the same temperature, as long as they are within tolerance. The efficacy of this will be down to the flow rates and the radiator at the back.
I certainly would not want my case to be filled with pipework that has to be removed every time a configuration change is needed; things could get messy very fast.
Fashionable coolant colours do not wash out of clothes either....
I'll bet you that just out of shot are the other two ASICs per pod, also connected in series, so four in series in total. The temperature difference between them will still only be a few degrees: water cooling is very effective at this, and the main power limit is the radiator.
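A quick sanity check on the 'few degrees' claim; the per-chip power and flow rate below are assumptions, since the real figures are not public:

```python
# Temperature rise of coolant across chips plumbed in series:
# dT = P / (m_dot * c_p) per chip. Figures are illustrative only.

CP_WATER = 4186.0            # J/(kg*K), specific heat of water
CHIP_POWER_W = 200.0         # assumed dissipation per TPU
FLOW_LPM = 1.0               # assumed coolant flow, litres per minute

mass_flow = FLOW_LPM / 60.0  # kg/s (1 litre of water ~ 1 kg)
dt_per_chip = CHIP_POWER_W / (mass_flow * CP_WATER)  # ~2.9 K

for i in range(4):
    inlet = dt_per_chip * i
    print(f"chip {i}: inlet +{inlet:.1f} K, outlet +{inlet + dt_per_chip:.1f} K")
# The last chip's inlet runs only ~8.6 K above the first's, so 'a few
# degrees' per hop is about right at plausible flow rates.
```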
All we care about is joules per inference; raw processing power is really more of a pissing contest. Google is offering the TPU 2.0 at about half the price of doing the same amount of work on Nvidia hardware on AWS.
What will be curious is how much further Google was able to lower it with the TPU 3.0. But it appears Google is now two generations ahead of Nvidia, as Nvidia has yet to catch up to the TPU 2.0.
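To spell out what 'half the price for the same work' means, here is the normalisation I have in mind; the prices and throughputs are placeholders, not real quotes:

```python
# Hourly price divided by throughput gives cost per unit of work,
# which is the only number that matters. Placeholder figures below.

platforms = {
    "Cloud TPU (4 chips)": {"usd_per_hour": 6.50, "examples_per_sec": 4000},
    "GPU instance":        {"usd_per_hour": 7.00, "examples_per_sec": 2000},
}

for name, p in platforms.items():
    examples_per_hour = p["examples_per_sec"] * 3600
    usd_per_million = p["usd_per_hour"] / (examples_per_hour / 1e6)
    print(f"{name}: ${usd_per_million:.2f} per million examples")
# With these made-up numbers the TPU comes out at roughly half the
# cost per example, which is the shape of the claim above.
```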
But it seems pretty obvious why Google needed the TPUs. They are doing the most real-sounding text to speech I have ever heard, using a neural network that generates audio at 16,000 samples a second. The computational power needed to get the better result is just incredible, and the problem is that the old method used very, very little. BTW, I would now consider text to speech a solved problem.
So Google had to significantly lower the cost of the compute, in joules, or they would not have been able to offer the service at a competitive price. There is just no third-party silicon that can do that. State of the art is Nvidia, and they have nothing close as of today.
It is also how they are able to do the John Legend voice. We now have six voices, but I would not expect to see a ton of them: the issue is that most of the power used today goes to memory access, and moving each voice's model into memory would be too expensive.
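A rough sketch of why memory traffic dominates; the energy figures are ballpark numbers in the spirit of commonly cited estimates (e.g. Horowitz, ISSCC 2014), and the model size is made up:

```python
# A DRAM access costs orders of magnitude more energy than a
# multiply-accumulate, so streaming a model in from memory is the
# expensive part. All figures below are rough assumptions.

PJ_PER_MAC = 1.0          # ~pJ for a low-precision multiply-accumulate
PJ_PER_DRAM_BYTE = 160.0  # ~pJ to read one byte from off-chip DRAM

model_bytes = 100e6       # hypothetical 100 MB voice model

dram_mj = model_bytes * PJ_PER_DRAM_BYTE * 1e-12 * 1e3
mac_mj = model_bytes * PJ_PER_MAC * 1e-12 * 1e3  # ~1 weight per byte

print(f"One DRAM sweep of the model:   {dram_mj:.1f} mJ")  # ~16 mJ
print(f"One MAC over the same weights: {mac_mj:.2f} mJ")   # ~0.1 mJ
# Swapping a different voice's model in and out per request would pay
# that DRAM cost constantly, which is the objection above.
```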
I will be most interested to see how long it takes anyone else to offer what Google is doing in this area. They just keep raising the bar, and we need someone to compete. Apple appears to have fallen asleep. MS could have been doing it, but without mobile it just does not matter much.