How would an olde worlde MB (or indeed a modern one) supply the low and accurate voltages needed? Or does the chip include a voltage regulator as well?
Intel Labs has created a prototype processor that achieves a high level of energy efficiency by running at voltages barely above those required to light up its transistors. Intel's marketing folks trumpet that the processor, code-named Claremont, can be "run off a solar cell the size of a postage stamp", but at CTO …
A lot of the old P3 MBs have a plug-in (not soldered in) voltage regulator. At least all workstation and small server MBs from Compaq and HP, as well as some Asus MBs, do.
I am in fact typing this on one - a dual 1GHz P3 which I use as a development platform and an X-term. Its voltage regs, IIRC, are soldered in, but I have a couple with plug-in regs in storage somewhere.
In any case, there is more than enough space on the P3 CPU board to put a local voltage regulator which takes the 1.75 VCore and feeds the modified core at the voltage required. P3 is also a good point to start off with. All P3s are in the 15-22W thermal design range. So a P3 core (if made at modern silicon tech) should be sub-5Ws to begin with or even less.
As far as dumpster diving... They are making me laugh. I still have a whole bag of working P3 MBs. My DIY NAS is a P3, my workstation is a P3 and I have at least 3-4 working P3s (single and dual CPU) in storage.
To be fair to Intel, 2 W was mentioned for a 100 GFLOPS system that currently requires 200 W. If those chips were available I'm sure they would be snapped up. But they're not yet.
The article provides no clear information on the power requirements of Claremont. The improvements are certainly impressive, but they sound like process work, at which Intel excels. In theory the same improvements would work for every architecture, including ARM.
"two watts or less in a handheld – a hundred gigaflops in your hand". It looks to me that even dual core ARM @1GHz cannot get you anywhere near the 100GFLOPS:
"As of 2010, the fastest six-core PC processor reaches 109 GFLOPS (Intel Core i7 980 XE) in double precision calculations".
Regarding the 2W goal -- you're right, they should be aiming well below 1W (better below 0.5W) to be taken seriously in the "handheld" market.
Read the article, they're saying the aim is to get 200W worth of present-day computing down to 2W. So divide current power requirements by 100.
The 11" MBA has a 1.6GHz dual core mobile i5 with built-in graphics with a TDP of 17W. Reduce that by 100 and that's basically 170 mW max power consumption, much lower idle. That's easily mobile phone territory, and MASSIVELY more compute power than current mobile phone chips.
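The divide-by-100 arithmetic in the post above can be checked with a quick sketch. The 200 W to 2 W ratio comes from the article; applying the same ratio to a 17 W TDP part is the poster's extrapolation, and the function name here is purely illustrative.

```python
def scaled_power_watts(current_watts, improvement_ratio):
    """Power draw after dividing by a hoped-for efficiency ratio."""
    return current_watts / improvement_ratio

# Ratio implied by the article's 200 W -> 2 W example.
ratio = 200 / 2

# Assumption: the same ratio carries over to a 17 W TDP laptop chip.
mba_milliwatts = scaled_power_watts(17, ratio) * 1000
print(f"ratio: {ratio:.0f}X, scaled TDP: {mba_milliwatts:.0f} mW")
```

This only restates the poster's sums; it says nothing about whether a 100X improvement is achievable in practice.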
I tried... I really did try. But so many multiplies and numbers were thrown about that I just got thoroughly confused by the marketing crap. Here's all the different statements in the article...
"it also has a high dynamic range that allows it to be cranked up to deliver ten times the low-power performance by increasing the voltage." (10X performance improvement?)
"Let's take an example of a hundred-gigaFLOPS system today," he said. "If you want that performance, it will require about 200 watts of power. With [Intel's] extreme-scale technology, we would like the same level of compute performance requiring two watts or less in a handheld – a hundred gigaflops in your hand." (100X Power improvement)
"In today's chips, transistors are operated at several times their threshold voltage. By redesigning a processor to be tolerant to near-threshold voltages, Borkar said, "What the theory tells us is that it will increase the energy efficiency by about 8X or so." (8X energy efficiency improvement)"
"Claremont's efficiency improvement was more at the 5X level. "You might be wondering," Borkar said, "'Why only 5X?', because I said 8X earlier. The reason is that we use an old core here. If we had started this design from scratch, we could have got an 8X or 10X improvement."" (5X, 8X and 0X energy efficiency improvement.)
So that is 5X, 8X, 10X and 100X improvements; not surprising that I failed to generate a complete view of the world :-(
It's even better that they are talking about redesigning the Pentium core. It hasn't seen a lot of change since they dropped back to the PIII. Most of the silicon changes since have been dedicated to 64 bit, cache expansion, SSE extensions, memory controller, graphics, and virtualization. The core is essentially the same.
IA reimplemented specifically for mips/watt would significantly change computing. A near static design where clock speed matches compute needs would kick arse. No mips demand = near zero power consumption. Instant on is irrelevant if always on consumes only a few milliwatts.
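The "no mips demand = near zero power" idea above follows from the standard first-order model of CMOS dynamic power, P = activity x C x V^2 x f. Here is a minimal sketch of that model; the capacitance, voltage and clock figures are made-up illustrative values rather than Claremont numbers, and leakage (static) power is deliberately ignored.

```python
def dynamic_power_watts(switched_capacitance_f, vdd_volts, clock_hz, activity=1.0):
    """First-order CMOS dynamic power: activity * C * V^2 * f.

    Ignores leakage, which is why a real chip never reaches exactly zero.
    """
    return activity * switched_capacitance_f * vdd_volts ** 2 * clock_hz

# Hypothetical figures, chosen only to make the scaling visible.
full_speed = dynamic_power_watts(1e-9, 1.0, 1e9)      # nominal voltage and clock
near_threshold = dynamic_power_watts(1e-9, 0.5, 1e8)  # lower voltage, slower clock
clock_stopped = dynamic_power_watts(1e-9, 0.5, 0)     # no compute demand

print(full_speed, near_threshold, clock_stopped)
```

Because voltage enters as a square, dropping it and the clock together cuts power far faster than it cuts performance, which is the trade the post is asking for.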
Yawn. They've already done it. The x86-ness costs about 2% of the transistor budget. Even changing to an ISA that required no power at all would not change the power consumption of the chips by a noticeable amount.
ARM chips consume blow-all power because they have blow-all power. Seen any big ARM-based servers recently? Thought not. They'd be running Linux, so the OS isn't the reason. Perhaps it is because ARM licensees do not yet compete in the performance market.
One of the reasons why digital logic became popular was the dynamic range of analog signals that had to be handled. In analogue computing terms something like +/- 15v with *everything* between being a valid value.
Analogue controllers might use smaller ranges but the problem remained.
Now look at what the logical 1 and 0 ranges are on modern chips. They're still pretty generous.
Boom times for the makers of decoupling capacitors and single chip voltage regulators.
"One of the reasons why digital logic became popular. "In analogue computing terms something like +/- 15v"
Nope. Analogue logic and computing is horrid.
Analogue has very little dynamic range and is hopeless for most calculation purposes. Analogue multipliers and adders are very prone to drift and all sorts of problems. It is these limitations that cause them to be crap, not the voltage that they are driven with.
Some digital circuitry, such as ECL, has/had positive and negative supplies.
As far as handling noise etc goes, all digital signals are just clipped analogue signals. Digital signals are typically far more robust though.
If you read the article as referring to the original P5 Pentium core, and not the P6 and its derivatives (PPro, PII, PIII, PM,... even the first Core architecture to some extent, from what I understand), the "dumpster diving" comment makes more sense. After all, Intel continues to have "Pentium" processors in their modern lineup.
"Let's take an example of a hundred-gigaFLOPS system today," he said. "If you want that performance, it will require about 200 watts of power."
Well, it might take 200Watts worth of Intel hardware to get 100GFLOPS. But there are plenty of industry examples that already outperform that. Take the Cell processor: that weighed in at about 250GFLOPS for 80Watts (32Watts / 100GFLOPS). And I wouldn't mind betting that most GPUs that get up to 100GFLOPS (i.e. all of them these days?) use much less than 200Watts. And just how many ARM SOCs do you need to get 100GFLOPS? They seem to deliver *enough* performance on very little juice indeed.
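For anyone wanting to check the 32Watts / 100GFLOPS figure, a small sketch that normalises both claims to watts per 100 GFLOPS. The inputs are simply the numbers quoted in this thread (80 W and 250 GFLOPS for Cell, 200 W and 100 GFLOPS from the article), not independently verified, and the function name is illustrative.

```python
def watts_per_100_gflops(watts, gflops):
    """Normalise a power/throughput pair to watts per 100 GFLOPS."""
    return watts / gflops * 100

# Figures as quoted in the thread, not measured values.
cell = watts_per_100_gflops(80, 250)
article_example = watts_per_100_gflops(200, 100)

print(f"Cell: {cell:.0f} W per 100 GFLOPS")
print(f"Article's example: {article_example:.0f} W per 100 GFLOPS")
print(f"Ratio: {article_example / cell:.2f}X")
```

Note that peak GFLOPS figures from different vendors may mix single and double precision, so the comparison is only as good as the quoted numbers.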
I think this is Intel missing the point again. If you *really* want to deliver a workload with the absolute minimum of power consumption, starting off with the x86 as the basis for delivering it is not necessarily going to be the optimum solution. Intel are very good at forcing silicon manufacturing towards ever more impressive transistor performance, but everyone else catches up sooner or later and just builds ARMs using the same tricks. And ARMs seem to have an inherent architectural advantage when it comes to Performance/Watt metrics.
Where this may just save Intel (at least for a while) is in the world of servers. If they can point to significant power savings in the data centre then the operators will be replacing their equipment as quickly as they possibly can.
Er, no. If your processor can only get FLOPs when given an embarrassingly parallel problem, then it isn't interesting.
We've had GPUs and Cells for years and yet they haven't taken over the world. They aren't useful for much beyond eye-candy. Oh, and don't point me to a whole raft of El Reg articles about how someone has programmed an FFT on one of them. The very fact that these stories are still newsworthy, years after the technology arrived on people's desks, tells you that this hasn't caught on.
Well, I guess it depends on what you call an interesting compute problem ;-)
When you stop and look at the high performance floating point compute jobs that your average man on the street actually wants done (and is therefore 'interesting', at least from an industrial competition point of view), it's things like video / audio codecs, and to a lesser extent 3D graphics and games physics. And that's about it. Most people's high performance floating point requirements *are* very parallel indeed. That's why Nvidia and ATI have successfully sold so many billions of very parallelised GPUs, and why almost every smart phone out there has one too. In that sense they really have taken over the world.
My only point really is that whatever Intel/AMD can achieve with a general purpose CPU someone like NVidia, Qualcomm, ARM, etc. is likely to surpass once they've mastered the comparable silicon manufacturing techniques. That has consistently been the case up to now, and the commercial realities of today are clear evidence of that. And now there are things like CUDA and OpenCL which are threatening to take even more floating point workload away from the CPU.
Until Intel can get the performance / Watt to a level where the x86 battery life is meaninglessly long or the electricity bill insignificant, they're not going to get a look in. Maybe these low operating voltages will get them there, but I doubt it. Anyway, who wants 100GFLOPS in a handheld device?
"My only point really is that whatever Intel/AMD can achieve with a general purpose CPU someone like NVidia, Qualcomm, ARM, etc. is likely to surpass once they've mastered the comparable sillicon manufacturing techniques."
No. Because as soon as these excellent people try to build a processor for non-embarrassingly parallel problems, they end up building something that looks like an x86. That's a single-threaded general purpose processor. A tiny fraction of the die area is spent on the instruction decode. Everything else is ISA independent, determined by the target market segment and frankly designed in much the same way across the entire industry.