Nvidia says Google's TPU benchmark compared wrong kit

It's not easy being Nvidia: the rise of AI has put a rocket under demand for GPUs, but the corollary to that is World+Dog publishing benchmarks to try and knock Nvidia off its perch. The company is famously touchy about such things – witness last year's spat with Intel over benchmarks it didn't regard as fair. Well, it's …

  1. Ian Michael Gumby

    Google still kicks NVIDIA in terms of power...

    Not a fan of Google, but 75 W vs 250 W... that's a lot less heat and power consumption.

    And it's still twice as fast as Nvidia.

    I hope Nvidia decides to step up to the challenge and improve performance while reducing power consumption.

    1. Anonymous Coward

      Re: Google still kicks NVIDIA in terms of power...

      I think you've rather missed the point of the article.

      It's only "twice as fast" on the inference phase because it's a single-purpose inference chip.

      It is of no use for training, which is the genuinely hard bit of machine learning. That is obvious to anyone with a clue, because the table figures compare TOPS in INT8 with TOPS in 32-bit floating point. That's an order of magnitude more complex a set of operations, so you'll have to forgive them for only using 3.3x more power when everything is going at full pelt (which it never does).

      Putting it in simple terms, Nvidia are saying that the comparison is flawed because the chips do different things, and unfair because it targets their legacy tech. Admittedly they then go on to make their own comparison, but they've probably got a point. GPUs are readily available and easily targeted in ML frameworks. Custom silicon is not. Given their flexibility, GPUs are going to be the superior choice for the foreseeable future, probably at least until FPGA-on-CPU packages from Intel come along with decent support in the standard frameworks.
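The power arithmetic argued over in the comments above can be checked with a rough sketch. The 75 W and 250 W figures and the "twice as fast" claim are taken from the thread itself and are treated here as assumptions, not measured values; the function names are made up for illustration.

```python
# Figures quoted by commenters in this thread; assumptions, not measurements.
TPU_WATTS = 75.0
GPU_WATTS = 250.0
TPU_SPEEDUP = 2.0  # "twice as fast" on inference, as claimed in the first comment


def power_ratio(gpu_watts: float, tpu_watts: float) -> float:
    """How many times more power the GPU draws than the TPU."""
    return gpu_watts / tpu_watts


def perf_per_watt_advantage(speedup: float, gpu_watts: float, tpu_watts: float) -> float:
    """Relative performance per watt of the TPU over the GPU.

    (speedup / tpu_watts) / (1 / gpu_watts) = speedup * gpu_watts / tpu_watts
    """
    return speedup * power_ratio(gpu_watts, tpu_watts)


if __name__ == "__main__":
    print(f"GPU draws {power_ratio(GPU_WATTS, TPU_WATTS):.2f}x the power")
    advantage = perf_per_watt_advantage(TPU_SPEEDUP, GPU_WATTS, TPU_WATTS)
    print(f"TPU perf/watt advantage on these figures: {advantage:.2f}x")
```

This only restates the commenters' numbers. It says nothing about whether INT8 and 32-bit floating-point throughput are comparable, which is the objection raised above.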

      1. annodomini2

        Re: Google still kicks NVIDIA in terms of power...

        My interpretation of the table is that the TPU isn't for training, but for run-time operations.

        Once you've trained a model that is built to run on this architecture, you can cut your ongoing operating costs by using these chips, as the equivalent power consumption is much, much lower.

        If the system can function adequately in 8-bit fixed point, then why not run it that way? What is the point of wasting all that power, and subsequently money, on a field-operable system when 8-bit is adequate for the job?

        Google have the resources to develop something like this, and if it allows them to increase functionality, save money, or both, then they would probably invest in it.

        Yes they could buy GPUs off the shelf, but just because something is available to do it this way does not necessarily mean it's the best solution for the job.

        FPGAs are great at being flexible, but they are usually not very power efficient.

        I get the impression you are looking at this from an academic/development environment perspective rather than a production environment perspective, which is where Google are operating these devices (assumed).
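The "run it in 8-bit fixed point" idea in the comment above can be illustrated with a minimal sketch of symmetric INT8 quantization. This is a generic textbook scheme picked for illustration, not a description of what Google's TPU actually does; the sample weights and function names are invented.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero input, where any scale would do.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the 8-bit integers."""
    return [q * scale for q in quantized]


# Hypothetical trained weights, made up for this example.
weights = [0.82, -0.41, 0.05, -1.27, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

if __name__ == "__main__":
    print("int8 values:", q)
    print("worst-case rounding error:", max_error)
```

The rounding error per value is bounded by half the scale factor, which is why a network that tolerates that much noise can serve inference in INT8 while training is still done in floating point.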

      2. Charlie Clark Silver badge

        Re: Google still kicks NVIDIA in terms of power...

        "I think you've rather missed the point of the article."

        No, the article missed the point of Google's report. They decided to build their own hardware because power consumption was key for their rollout plans. They still rely on nVidia for the training so nothing's changed there and I suspect they're not at all averse to offers for TPU replacement chips from nVidia or anyone else that have better OPs/Watt values.

        1. Sirius Lee

          Re: Google still kicks NVIDIA in terms of power...

          Take a vote, as I think you make the most relevant assessment. Yes, the Google article made it very clear that operations per watt is their benchmark. That and the ability to scale out. I did not get the impression they cared whether it was more operations at the same power level or the same number of operations for less power. By putting out an implementation in silicon they have created a target Intel and nVidia can aim at. I'm sure the Google team is delighted that early indications are that Intel and nVidia want to play their game.

  2. Solarflare

    Aren't Nvidia getting upset about Google doing to them exactly what they did to AMD? Comparisons like this are never massively reliable, they are created to gather interest and generate sales. "We are 80% better than x company's products" sounds a lot better than "We are 80% better than x company's old product line, and we might be alright in comparison to their new line, who knows?"

  3. Anonymous Coward

    If Google is waging war here...

    Does this mean we'll have another contender for shittiest closed source Linux driver contest?

  4. jacksmith21006

    Google shared details of TPUs from 2 years ago. Google never shares unless it has moved on to the next generation. But Nvidia really does not get it. It is about inference performance per watt.

    This is the problem for the chip makers going forward. The entire world changed. Before, chips were sold to the Dells of the world, which then sold to the end user.

    Huge disconnect. Now you have companies like Google using all their aggregated data to create chips, and that makes it very difficult going forward for independent chip makers, IMO.


Biting the hand that feeds IT © 1998–2022