1000 Petaflops = 1 billion ?
I think you'll find that's 1 x 10^18, which is a billion billion FLOPS
As for 2,048 bits: will this thing support variable-length number formats like the ICL DAP?
Does sound impressive though.
ARM is bolting an extra data-crunching engine onto its 64-bit processor architecture to get it ready for Fujitsu's Post-K exascale supercomputer. Specifically, ARM is adding a Scalable Vector Extension (SVE) to its ARMv8-A core architecture. SVE can handle vectors from 128 to 2,048 bits in length. This technology is not an …
Given that in some parts of the world a billion is 10^9 while in others it is 10^12 -- and that in the former parts 1000 Petaflops (10^18 flops) would be a Quintillion flops while in the latter parts it would be a Trillion flops -- methinks it would be better to eschew all of these confusing terms ending in "-illion" and just use the SI terms.
1000 Petaflops is one Exaflop. That's all there is to it.
Intel is going to have to improve its customisation game if it wants to stay in this market. Hardware customisations like this are ARM's ace up its sleeve. We can assume the actual chips that Fujitsu (no mug when it comes to chips, as its SPARCs show) builds will have additional whizz-bang stuff baked into the hardware, but this kind of compiler optimisation is going to give the HPC crowd wet dreams. Between this and FPGA Intel is going to be increasingly squeezed.
Between this and FPGA Intel is going to be increasingly squeezed.
Nah. Intel will just hand a large wad of cash to Microsoft and lo, Windows 11 will have compatibility problems (well, MORE of them) with AMD, as much by accident as DR DOS had problems.
And it'll still take 5 minutes to boot.
"Nah. Intel will just hand a large wad of cash to Microsoft and lo, Windows 11 will have compatibility problems (well, MORE of them) with AMD, as much by accident as DR DOS had problems."
Too bad you can't use the black helicopter/tinfoil hat icon while AC?
Crushing AMD doesn't benefit Intel that much, since AMD just caters to the low end (= low profit) on desktop/laptop and the Opteron server market is pretty much dead already. Besides, Microsoft is in bed with AMD for the next several years because of Xbox. Microsoft has its hands full trying to keep the PC market alive, and having Intel gain a monopoly and raise prices again will not play into its hand.
"And it'll still take 5 minutes to boot."
It's about 10 seconds from cold boot to desktop on my 3-year-old Windows laptop. Try harder next time.
> Between this and FPGA Intel is going to be increasingly squeezed.
Well - arguably the Xeon Phi core is a similar architectural concept, and with the purchase of Altera Intel is now in the FPGA game. But one can see how much noise Intel is making about Xeon Phi and their FPGA offerings (advertising "up to 10TFlops" for Stratix 10), so I guess it must be feeling squeezed already!
What will be interesting to see is how the economies of scale offered by ARM's licensing model trade off against Intel's fab leadership advantages.
A Cray-1 handled vectors of 64 double-precision reals, each 64 bits long. That is 4,096 bits, so SVE can go up to half of that, which is pretty good. And the Cray did the elements of those vectors one at a time in a pipeline, so breaking the vectors up into shorter pieces is not a failure.
But I don't think SVE will be offered on most ARM chips; instead, it will probably only go into ARM chips intended for use in supercomputers.
Certainly zero reason to put them in phones, but I could see them added to at least some server CPU designs. Just because you can't afford a supercomputer doesn't mean you don't have number crunching needs. If there's ever to be any hope for ARM servers gaining a foothold, to start they'll need to find a few niches where they can be clearly better than the x86 alternative.
Very impressive thinking going on here, and it will be really interesting to see what it delivers benchmark-wise with the newly recompiled applications. It's nice to see some non-Intel ideas out there, if only to keep them sharp.
BTW did anyone else enjoy " automagically identifying loops" I don't know or care whether it was a mistake it made me smile.
While you can force vectorization of that code (as you do by using compiler flags), doing so is not, in general, safe. Consider an invocation of vectorize_this in which a and either b or c point into the same array. (E.g. vectorize_this(&b, b, c);). There is now a loop carried dependence and the results generated by the vector code will be different from those generated by the scalar code.
If you know that the code is used without such overlaps, then the right answer is to modify the code and use the "restrict" qualifier on the arguments to inform the compiler of that fact. Though then, of course, you can't claim not to have to modify the code!
(FWIW I work for Intel, and this is an issue for everyone...)