Xilinx & ARM
Zynq is now practically ancient.
You're late to the party, Intel. BTW, your Igor may wish to visit the crypt.
Intel has expanded its chip customization business to help it take on the hazy threat posed by some of the world's biggest clouds adopting low-power ARM processors. The chip company announced on Wednesday at GigaOm Structure in San Francisco that it is preparing to sell a Xeon E5-FPGA hybrid chip to some of its largest …
Xilinx Zynq and Altera SoC FPGAs both use ARM multicores (I have a dual-core one on my desk here). Others have other hard cores too.
The hard part is getting the FPGA part going. For that Intel would have had to buddy up with an FPGA partner.
I predict nobody will touch these things. Intel have a terrible track record of developing then dropping fringe parts, leaving board designers high and dry. Where's my 80251, i960, ...? Gone. No doubt Intel will soon dump these parts too. As a result, the embedded community is increasingly shy about designing Intel parts into a system.
[I see Mage got there while I was typing. Well it's not a race is it]
Boring.
Z is for Zynq.
Zynq is from Xilinx. A nice ARM and a nice FPGA, one socket, even one chip. See e.g. http://www.xilinx.com/products/boards_kits/zynq-7000.htm (dev boards, prices from under $200 qty 1)
See also ARM's Cortex-M1, the first ARM designed for implementation on an FPGA (even a now-modest one like a Xilinx Spartan 3)
http://www.arm.com/products/processors/cortex-m/cortex-m1.php
Other FPGA vendors are available (and almost certainly have ARM support too).
Still, for anyone who still cares about Windows, the Intel/x86 offering might be attractive, for now.
No, not a race, all a bit of fun.
Dual ARM cores were already around a couple of years ago, I think. I only have a bunch of Spartan 3 boards; adding an external ARM or PIC is better on those, as a soft CPU eats too much fabric. What I'd like would be a 100MHz 14-bit ADC and a couple of 100MHz 16-bit DACs built in. And 40hrs battery life :-)
Ah! That brings back the nightmares. I was working on those DRC cards back in '06. They were a pig to use, and the socket form factor was a major limit, as there was little room for local cache for the FPGA.
The biggest drawback though, and I suspect exactly the same will be true of the Intel offering is that HyperTransport and QPI are not hot-swappable which means rebooting the OS to change the FPGA config. Not very good if you want agile workloads.
The same is true of PCIe of course, but at least both "X" and "A" claim to have a working partial reconfiguration solution for PCIe. An FAE from one of those two admitted to me recently that, to their knowledge, they had two whole customers actively using PR on their latest generation of FPGA, though.
I was hoping in this announcement for something a little more sophisticated than an internal QPI interconnect to the FPGA. The penalty for the software thread is still worse than an L3 cache miss!
(Oh, and Atmel had FPSLIC, Altera's Excalibur line put an ARM hard-core 'stripe' on APEX20K fabric, and Xilinx had Virtex-II Pro with multiple PowerPC 405 cores, all around the turn of the century. Nothing new here :)
In '06 we just brought the system up. The first part could talk to the Opteron socket's attached memory (128 bits wide). I also got the FPGA to reconfigure without having to reboot the machine (look for patent 8,145,894). For the second generation we used the whole keepout of the Opteron fan and had 6 attached memories.

I work with the Altera OpenCL group and we configure over PCIe directly into the FPGA. If you knew FPGA and reconfigurable computing history you'd know what's going on now. Altera's OpenCL works great; I use it every day. The new part is that the Generation 10 stuff from Altera now supports hardened floating point, and the parts are huge too. Compared to the time frame you are talking about, the devices are 50 to 100 times better (area × frequency improvement). FPGAs clearly beat CPUs on most tasks, and with hardened floating point I don't think GPUs have a chance. I personally designed the DRC module and I brought it up too, so I know what I'm talking about.
I'm hoping that Intel makes a prettier version of this, and integrates it more closely with the microprocessor logic, so that in addition to using it to add new functionality to the x86 chip, one could also use it instead to, say, decode an alternative ISA into micro-ops.
So the chip could be faster than Hercules, it could save Unisys from agonizing over the fate of ClearPath Dorado, and so on and so forth.
Actually, since alternative architectures might include some functionality that the x86 architecture still doesn't offer, it would be nice if the chip had enough goodness to offer both.
Oh, and the FPGA should not depend on fusible links, but should instead be based on RAM, with saving and restoring it being relatively convenient... so that operating systems could support allowing multiple applications, all running concurrently, to each reprogram the FPGA in its own way. With some restrictions, such as apply to the GPU on one's video card, to acknowledge the fact that it can't be saved and restored as fast as the registers.
"the FPGA should not depend on fusible links, but should instead be based on RAM"
With the greatest respect: when did you last look at volume-market FPGAs? Hint: they can be RAM based. The RAM is loaded either from an external device or from on-FPGA non-volatile (flash) memory.
Having both a Xeon and an FPGA in the same socket is very interesting. There are a number of biology applications that could use some bit twiddling, and for that matter some biophysics could offload the messier logic to an FPGA pipeline.
Let's see what the data speeds are first...
P.
Ahh, we may well have met at the SC meeting then... I wrote an internal review paper on your product ;-)
I was thinking that perhaps the best thing about having an FPGA in a socket , is that a custom lightweight communication protocol can be implemented...? Perhaps 100 ns zero byte latency?
3Tflops, eh? That looks interesting, if we can get perhaps 4 of them to talk to one another, using really tight coupling...
P.
Adapteva is busy fulfilling its Parallella pre-order backlog: a Zynq 7010 (for the most part) combining dual ARM A9 cores and FPGA fabric, coupled with their 16-core Epiphany chips (reg link here). Looks to be a pretty well-balanced system and consumes minimal amounts of power (relative to Xeon, naturally).
I'm not sure what the combination of Xeon + FPGA is supposed to achieve, but that's mainly because I don't understand exactly what Intel intends users to offload to the FPGA when they've already got super-beefy cores in the Xeon part. Maybe they're targeting some sort of FPGA-driven interconnect fabric? Still, wouldn't Xeon + ASIC be a much better pairing for that particular niche/application?
Otherwise, I just don't know. Customers might "dig" the reconfigurable bit, but FPGA just strikes me as being more of a stop-gap measure until the "real" peripherals can be built... maybe Intel just wants their users to do some R&D for them on the cheap.
How about traffic routing/switching between virtual systems?
I've no idea how practical that would be, but it seems like a good application for data going between Xeon processes. How about a load balancer in/for your virtual system?
I'm not sure this is aimed at ARM. Looks more like network processors or other appliances to me.
Intel really doesn't want anything that isn't x86. But if it thinks ARMs are dangerous, FPGAs are lethal for that model. Which is why they will never sell a separate FPGA part.
I can see a demand for CPU(x86) + CPU(ARM) + GPU + FPGA for scientific work and super computers but you're going to want to be able to control the distribution of those units. I think AMD is approaching this better with the options of additional embedded cores or cards.
While it seems that ChipZilla is reasonably good at getting speedy CPU chips out the door, why not have a little variety in the overall chip offerings? One of these days, we will all find out that there is a DEEP flaw and be almost powerless to do anything about it. To be sure, every chip gets tested, but these things DO happen (the floating-point divide flaw, etc.). One of these days the chip USERS will understand that it is probably a good idea not to put all the eggs in a single (ChipZilla) basket.
Who knows. There may be a problem with little-endian architecture that has escaped us (I can only hope!).