Interesting, but...
I think Ryzen still wins on performance
Chinese tech giant Alibaba claims to have designed the fastest RISC-V processor to date, and reckons it will open source at least some of the blueprints for others to use. The chip was unveiled this week at Alibaba's Cloud Summit in the Middle Kingdom, though details are curiously thin. Word reaches us of the development, …
RISC-V is coming from a standing start just a handful of years ago. CPUs such as the Western Digital SweRV and SiFive U74 are dual-issue in-order processors similar to the ARM Cortex A7 or A55 respectively, or roughly like a Pentium MMX or PowerPC 603e but with more MHz (and 64-bit in the case of the U74).
It's only a matter of time before many RISC-V companies have Out-of-Order CPU cores. CloudBear in Russia already announced their BI-671, Esperanto Technologies is going directly to OoO CPUs, the SHAKTI project in India are working on their "I Class". It would be surprising if others are not working on OoO cores as well -- especially those who already have dual-issue in-order working.
The Alibaba CPU is right where you'd expect it to be: pretty similar specs on paper to the ARM Cortex A75.
The performance numbers are right around what you'd hope to get by going to 3-issue OoO from the existing in-order processors.
Of course this is all nowhere near Ryzen or Skylake, or Apple's ARM designs, which are much more aggressive than ARM's own. Give CPUs like that maybe five more years to start to appear in RISC-V land.
What's important about RISC-V is not the instruction set or the implementation, but the fact that the instruction set isn't encumbered by intellectual property claims, meaning anyone can produce products based around it.
That doesn't necessarily imply silicon that implements the instruction set directly. You could envisage systems that treat the RISC-V code as an intermediate language and compile it into different instructions for custom silicon. You could also use compiler-like techniques to achieve some level of out-of-order execution without necessarily having a great deal of hardware support - using software to reorder the code in advance. Given that innovation in software is generally cheaper, I'd expect a hybrid approach to be of some interest.
Thanks for your reply.
You're correct that the business-model aspects of RISC-V are the important thing, not the technical merits or innovation. However, it's definitely worth noting that the technical merits are right in the ballpark with the likes of ARM or MIPS or SPARC, and better in some ways.
Yes, RISC-V makes a pretty good intermediate language or neutral software distribution format. It's very easy to emulate or JIT -- even the first working version of RISC-V QEMU immediately ran twice as fast as ARM32 or ARM64 versions of QEMU.
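To give a feel for why: RISC-V instructions are a fixed 32 bits (leaving aside the compressed extension), with the register fields always in the same bit positions, so decode is a handful of shifts and masks. A toy sketch -- nothing to do with QEMU's or rv8's actual code -- handling just ADDI and ADD/SUB:

    #include <stdint.h>
    #include <stdio.h>

    static int64_t x[32];                       /* integer register file; x0 is wired to 0 */

    static void execute(uint32_t insn) {
        uint32_t opcode = insn & 0x7f;          /* fields sit in fixed places */
        uint32_t rd     = (insn >> 7)  & 0x1f;
        uint32_t funct3 = (insn >> 12) & 0x07;
        uint32_t rs1    = (insn >> 15) & 0x1f;
        uint32_t rs2    = (insn >> 20) & 0x1f;
        int64_t  imm    = (int32_t)insn >> 20;  /* sign-extended I-type immediate */

        int64_t result = x[rd];
        if (opcode == 0x13 && funct3 == 0)      /* ADDI */
            result = x[rs1] + imm;
        else if (opcode == 0x33 && funct3 == 0) /* ADD (funct7=0) or SUB (funct7=0x20) */
            result = (insn >> 25) ? x[rs1] - x[rs2] : x[rs1] + x[rs2];

        if (rd != 0)                            /* writes to x0 are discarded */
            x[rd] = result;
    }

    int main(void) {
        execute(0x02A00513);   /* addi a0, x0, 42 */
        execute(0x00A50533);   /* add  a0, a0, a0 */
        printf("a0 = %lld\n", (long long)x[10]);  /* prints 84 */
        return 0;
    }

Every other RV64I instruction decodes the same way, which is part of why interpreters and JITs for it stay small and fast.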
I was co-author of a RISC-V simulator and paper showing that if you concentrated on mapping directly to x86_64 you could get about twice the speed of QEMU, or often only about 20% to 30% slower than optimised x86_64 native code.
https://carrv.github.io/2017/papers/clark-rv8-carrv2017.pdf
Some other people have since picked up on this work and applied it to using RISC-V as a high performance method of implementing smart contracts on the Blockchain.
https://www.youtube.com/watch?v=wxZvX1GmvA4
You'll see my work referenced at around 17m15s.
Reading this reminds me that when, on Tomorrow's World (referred to by our chemistry teacher as "Yesterday's Village"), Raymond Baxter explained why there would never be a handheld computer, he had no idea that not only would they appear, but almost everybody would have one, and the computers in them would have enormously more compute power than the biggest mainframe of the 1960s.
Oh yes, we forgot to emphasise that - just assumed everyone was on the same wavelength. RISC-V, as an ISA and community, is still very new compared to incumbents, and today's available silicon is currently up to about Arm Cortex-A50-series performance.
So there's everything to play for. Don't forget: Arm's CEO late last year told a room of journos, including those from El Reg, RISC-V was keeping Arm's engineers and salespeople "on their toes."
C.
>Do the open source processors mean that they will be cheaper to buy?
No.
The cost of buying IP when developing an ASIC is as nothing compared to the rest of the NRE costs of going to full silicon production. This is the big fallacy that needs exploding: silicon IP is not software. Just saying "Open Source" won't magically give you free IP and all the support you may need to actually use it.
We've had open source RISC cores for donkey's years. If it were all so simple then everyone would already be using LEON.
Might have to do with its origins and purpose. LEON is currently 32-bit-only (while most computing tasks will need at least 64 bits going forward to address >4GB and so on), its licensing is not as free, and most development on it has been focused on electrical hardening (adverse environment handling), as LEON was developed for the ESA for use in satellites and such. Also, LEON is based on SPARC. RISC-V is more general-purpose, is designed for custom extensions, and already has a 64-bit path set.
Yes, and keep in mind also that, with an MMU (which anything powerful enough to need to access over 4GiB of memory in the first place will have by default these days anyway), the limit is really 4GiB /per process/.
The advantages of 64-bit in both x86 and ARM for day-to-day usage come much more from AMD and ARM using the shift to also revise their instruction sets (many more registers, plus more compiler-friendly instructions) than from the huge flat memory space.
Having said that, 64-bit - even at the day-to-day compute level - can make some things easier (eg, memory-mapping large storage volumes), so that's not to call 64-bit 'unnecessary', of course. Just have to balance that ease against all the extra transistors needed for the wider registers, pathways, and so forth.
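As a minimal sketch of that memory-mapping point (the 8GiB file name here is made up): with 64-bit pointers the whole volume is addressable in one go, while a 32-bit process, whose size_t can't even express the length, would have to window it in pieces.

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void) {
        size_t len = (size_t)8 << 30;               /* 8 GiB: overflows a 32-bit size_t */
        int fd = open("bigvolume.img", O_RDONLY);   /* hypothetical large volume */
        if (fd < 0) { perror("open"); return 1; }

        void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

        printf("whole volume mapped at %p\n", p);   /* byte i is just ((char *)p)[i] */
        munmap(p, len);
        close(fd);
        return 0;
    }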
*For my applications* I am very interested in 32-bit RISC cores and can't see myself needing more than one 64-bit core for certain specialised tasks. Ideally, some sort of wide-narrow multicore setup (akin to ARM's big.LITTLE configurations) with a large cluster of RV32 general-purpose cores, plus the aforementioned RV64 core and a few other task-specific cores that can be brought up on an as-needed basis.
Perhaps OP should have said:
'But we do have "free as in freedom" food and water everywhere but the "Land of the free"'
I don't think Americans want freedom these days, they're too busy defending their right to shoot one another and receive tips for poor customer service to worry about the little things like freedom...
"The tips are supposed to supplement that poor hourly rate."
If that were the case, surely the staff would make an effort to provide adequate service in order to earn those much-needed tips. As it stands, service is consistently terrible in the US because tipping is the norm and expected. Perhaps if so much money wasn't being spent on Monsanto's special seed, there would be a few shillings left over for the waiting staff too. If only those staff had some way to control their government and make it happen*...
Yes, I know everyone gets a vote in the US, but votes don't really make a difference in such a corrupt system unless they are submitted with a truck full of money.
Yes, because in the Land of Freedom (from ethics) they still think that a decent hourly rate is something communist... so they sent most of their manufacturing to a communist country to ensure they can keep on paying very poor hourly rates.
If they don't receive enough in tips to bring them up to at least the standard minimum wage, the employer is required to pay them the difference. Simply put, if they make no tips they are still working for the same minimum wage as everyone else. Of course, there are several states which don't allow tip credits and have the same minimum wage for all, and service there typically suffers. For good service personnel working in the right places, six-figure incomes are within reach.
" he used some of his own seed each year for the next planting."
Not _quite_ as cut'n'dried as that. The farmer in question went out of his way to cultivate the Monsanto stuff by blasting what grew with Roundup and only harvesting seed from what survived.
If he hadn't done that, Monsanto wouldn't have won.
Their more recent varieties are sterile, so you can't grow them from last year's seed.
"Farmers can grow their own crops"
Being a farmer is not free as in freedom. All the land worth having is owned by somebody else. That's why, historically, so many millions of starving rural inhabitants constantly pour into the big cities looking for a chance to stay alive. And many farm owners don't allow their employees to eat the produce anyway, as they can get better prices selling it to us lot - the labourer can sure grow the crop but they get sacked if they eat it.
"All the land worth having is owned by somebody else."
In the UK that happened because the newly minted business wealthy realised they could have any laws they wanted if they could get themselves into parliament and promptly did just that.
The enclosure of the commons was the result - they passed laws awarding themselves ownership of common land.
"That's why, historically, so many millions of starving rural inhabitants constantly pour into the big cities looking for a chance to stay alive"
Which played perfectly into the hands of those same people by ensuring an abundant supply of dirt cheap labour for their factories. For most people the choice was "starve in the countryside or starve in the cities".
And don't run afoul of government edicts.
Could we be seeing the beginning of a new trend - fast but insecure OOO models if that is your thing, alongside secure but slower in-order models for the more cautious? Open Source is uniquely positioned to stop proprietors fscking around with the choices you can enjoy.
I rather like the idea of packing both onto one chip and handing stuff off to the fast baby only when deemed non-critical.
Note: most of the recent security issues (Spectre, et al) are down to _speculative execution_, which is different to out-of-order execution. In OoO, the processor is free to re-order instructions which do not share a data dependency. If it needs to compute x = a + b and y = c + d, then it can do either one first. So if c and d are in registers, say, while b needs to wait for the result of a previous computation, the cpu can compute y first and x later, even if the code provided by the compiler has x first, then y.
For speculative execution, the cpu will execute instructions which may or may not actually be required. If the code it's running contains something like 'q = p * r; if ( q < 17 ){ w = *k + 2}' then speculative execution allows the cpu to guess that q will be less than 17 and push on with computing w. It might do this if the calculation of 'q' is taking a long time - perhaps 'r' is the result of a previous computation, or 'p' needs to be loaded from RAM (rather than cache). If, when 'q' is known, it turns out to be 19, then the computation results for w are thrown away. Part of the whole Meltdown problem was that the memory access *k by the speculative execution logic was done without appropriate permission checks, and even though the result of the access was not visible to the program, side-channels (like timing or cache population) _were_ visible and able to be exploited.
At least, that's how I remember it, but it's been years since I've studied computer architecture stuff.
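Putting the two examples above into compilable form (same variable names as in the description; the comments describe what the hardware is allowed to do, not anything the C code itself requests):

    #include <stdint.h>

    int64_t demo(int64_t a, int64_t b, int64_t c, int64_t d,
                 int64_t p, int64_t r, const int64_t *k) {
        /* Out-of-order: x and y share no data, so the core may finish
           y first if b is still in flight. The result is unchanged. */
        int64_t x = a + b;
        int64_t y = c + d;

        /* Speculative: the core may guess the branch and start on w
           before q is known, squashing w if the guess was wrong. */
        int64_t q = p * r;       /* slow: say p had to come from RAM */
        int64_t w = 0;
        if (q < 17)
            w = *k + 2;          /* the Meltdown flaw: this speculative load
                                    skipped permission checks, and its cache
                                    footprint survived the squash */
        return x + y + w;
    }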
"While RISC-V has strong support for instruction set extensions, 50 seems a bit much"
50 does seem quite a lot, but while RISC-V has the "user mode" instructions well covered, it is extremely thin on privileged operations and (at least when I was involved in an R5 design at the end of last year) has nothing to support OSes when they do things that need to flush pages from the cache when memory translations are changed, and flush the translations from all levels of TLB. This could end up being a weakness in RISC-V, as it seems possible that everyone will add their own extensions, so everyone will have to maintain their own set of patches to Linux etc to support their particular version of cache management.
I'm sorry but that's simply wrong. Look at the SFENCE.VMA instruction described on p114 of "The RISC-V Reader" or p56 of the reference manual:
https://content.riscv.org/wp-content/uploads/2017/05/riscv-privileged-v1.10.pdf
rs1 optionally specifies the VM page for which the mapping has been changed, and rs2 optionally specifies the address space in which the mapping has been changed. If neither of those is specified (i.e. is set to register x0) then the entire TLB needs to be flushed, but fine grained control is also possible.
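In kernel code that looks something like the following (GCC inline assembly, to be executed in supervisor mode; the helper names are mine, though Linux carries very similar ones):

    #include <stdint.h>

    /* Flush the mapping for a single page in a single address space. */
    static inline void local_flush_page(uintptr_t vaddr, unsigned long asid) {
        __asm__ __volatile__ ("sfence.vma %0, %1"
                              : : "r" (vaddr), "r" (asid) : "memory");
    }

    /* rs1 = rs2 = x0: fence against all translations, i.e. a full flush. */
    static inline void local_flush_all(void) {
        __asm__ __volatile__ ("sfence.vma x0, x0" : : : "memory");
    }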
Yes, SFENCE.VMA does impose a fence on memory operations, but (in an architecture that says it's RISC) should a single instruction be expected to flush or invalidate data from (multiple levels of) caches? There seemed to be plenty of people on the R5 discussion groups wanting to specify cache management instructions, though.
The opcode is defined and programs (operating systems) can use it.
It's up to the chip designer whether the instruction is implemented in (as you point out) somewhat complex hardware that does everything, OR traps to machine mode, where a subroutine of normal instructions (with whatever logic, loops etc are needed) manipulates the TLB and/or caches of that particular core by reading and writing CSRs, or possibly by using some simpler custom instructions.
It's good for CPU designers to have a choice of how they do it, to cover a wide range of design points but with the exact same OS code running on all.
Similarly, the RISC-V architecture specifies the format of page tables in memory, but says NOTHING about what TLB hardware you might have, or whether TLB misses are handled with hardware that walks the page table or by a trap to Machine mode to do page table walking and TLB reload in software, or what.
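To make the software-walk option concrete, here's a toy Sv39 walk of the sort such Machine-mode reload code could do. phys_read64 is a made-up stand-in for however M-mode reads physical memory, and A/D-bit updates and superpage alignment checks are elided; the PTE layout follows the privileged spec (V in bit 0, R/W/X in bits 1-3, PPN from bit 10 up).

    #include <stdint.h>

    #define PTE_V 0x1UL
    #define PTE_R 0x2UL
    #define PTE_W 0x4UL
    #define PTE_X 0x8UL

    extern uint64_t phys_read64(uint64_t paddr);   /* hypothetical M-mode helper */

    /* Return the leaf PTE mapping va, or 0 to signal a page fault.
       root_ppn is the page-table root PPN from the satp CSR. */
    uint64_t sv39_walk(uint64_t root_ppn, uint64_t va) {
        uint64_t table = root_ppn << 12;               /* base PA of current level */
        for (int level = 2; level >= 0; level--) {
            uint64_t vpn = (va >> (12 + 9 * level)) & 0x1ff;  /* 9-bit index */
            uint64_t pte = phys_read64(table + vpn * 8);
            if (!(pte & PTE_V))
                return 0;                              /* invalid: fault */
            if (pte & (PTE_R | PTE_W | PTE_X))
                return pte;                            /* leaf (possibly a superpage) */
            table = (pte >> 10) << 12;                 /* non-leaf: next level down */
        }
        return 0;                                      /* ran out of levels: fault */
    }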
Meh! and Nonchalant double shrug....
64-bit CPUs are soooooo passé.... Our parent Aerospace company went with a combined-design, custom-instruction-set CPU/GPU/DSP, 128 bits wide, on GaAs and GaN substrates. Works a LOT FASTER at 60 GHz, and the separate and very specialized 128-bit-wide Vector/Array processor runs at an even higher 2 THz!
I would suggest that those up-sized 60 GHz and 2 THz clock speeds and the multi-vectored instruction set are probably WHY that computer is 595 times more powerful (119 ExaFLOPS sustained!) than the Summit supercomputer at Oak Ridge labs (200 PetaFLOPS)!
.
Next Up! Fully Optical Computing into the Petahertz range (i.e. UV from 3 to 30 PHz!) Let's see how much vacuum we can make here on Earth so that the UV doesn't get absorbed during transmission in an Opto-CPU!
.
To pop this up a couple of levels for people who don't want to dig deep...
ONE VERSION of Linux runs on all RISC-V hardware. Hardware-specific patches are NOT needed.
All packaged Linux (etc) distributions can assume RV64GC. That is, 64 bit hardware including the extensions for multiply/divide, atomic transactions, single and double precision floating point, and variable length 16/32 bit instructions.
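In code terms, "assume RV64GC" is roughly a one-line build-time check using the compilers' standard RISC-V predefines (__riscv, __riscv_xlen, __riscv_flen); this sketch only approximates the full G+C baseline:

    /* Refuse to build for anything below an RV64 hard-float baseline. */
    #if !(defined(__riscv) && __riscv_xlen == 64 && defined(__riscv_flen))
    #error "this package assumes an RV64GC-class target"
    #endif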
In any particular machine, maintenance of caches or TLBs (for example) is either provided directly in the hardware, or else it is the responsibility of the hardware vendor to provide Machine Mode software that traps and emulates the required functionality. This Machine Mode software must be installed by the boot process before the Linux kernel is invoked. As far as the Supervisor Mode software (e.g. the Linux kernel) is concerned, everything Just Works.
RISC-V, its toolchain and ecosystem came out of Patterson's life's work at Berkeley (with Hennessy doing the parallel RISC work at Stanford). So isn't it kinda ironic that the world is actually still reliant on US-origin tech? There's a lot of folks who put a lot of money, time and effort into research for these processors. I'm glad there is an open source option here. If you really think everything should be free, ask yourself whether you would work for free. Meanwhile, the intention of RISC-V is to democratize processor development and allow the potential for fit-for-purpose, highly efficient processors on the cheap. I wonder whether non-democratic states will respect that intent, or start making highly efficient missile control systems quickly and cheaply.
The place of origin matters less than you'd think.
Examples:
The number of nations working on the Manhattan project
The origin of the Jet engine
Von Braun's contribution to Apollo
Bletchley Park improved upon Polish work
SSEM Stored Program Computer in Manchester
Linux
Wind the clock back further and we are all just apes learning to hit each other with sticks.
"I wonder if non-democratic states will respect that intent, or start making highly efficient missile control systems quickly and cheaply."
Being able to build an efficient anti-missile system will help those countries to avoid being destroyed by Freedom Bombs from "democratic" states.
In case you are under some kind of delusion, "democratic" states or entities in "democratic" states have no obligations to and can just as easily not respect the "intent to democratize processor development".
isn't necessarily what happened.
The lawsuit Monsanto brought against the farmer in question was for using Roundup, Monsanto's weedkiller, on the crop he grew from the grain which supposedly blew into his field from a neighbour's field. The combination of the GM grain and its matching weedkiller was covered by Monsanto's patents at the time. The farmer agreed not to do this any more, and Monsanto withdrew its lawsuit.
Monsanto's GM grains are a big financial advantage for farmers when combined with Monsanto's Roundup glyphosate weedkiller. Going non-GM means less tonnage of grain produced per acre, which means less money. The patents on the original GM seed strains have long expired, and it is now not a breach of patent or licensing to use RR with those seed strains. The problem is that "saved" seed from a previous harvest is not as vigorous and productive as properly bred hybrid seed grown for and sold to farmers by Monsanto and others. The USDA will not provide crop insurance for saved-seed plantings, since they're more prone to disease and crop failure than licensed hybrids.
Monsanto's new licenced hybrids are covered by new patents though...
Given that it seems to take less than five years for enough weeds to develop resistance to Roundup to defeat any benefit, I doubt the farmers will be better off in the long run or produce more food. I've got weeds in my gravel paths that are now resistant to all but NaClO3 (sodium chlorate), which you can't buy any more, but fortunately I've got some left over.
" I've got weeds in my gravel paths that are now resistant to all but NaClO4"
Granted it only takes care of the bits above ground(*) but boiling water (or, better - a shot of steam) works pretty effectively.
(*)Yes, the weed comes back, but it takes a fair amount of stored energy from the root to do it, and repeated applications mean it's busy expending everything it's got trying to grow something instead of setting seed. The downside is that this is expensive in labour (and energy).
My expectation is that this will rival the ARM A72 in integer performance, and that's a five-year-old design; RISC-V is a young lad growing at an enormous rate. If the designs remain open, I imagine in two years we will have 6-wide OoO designs on the plate, and in one more year better memory subsystems and predictors than ARM can manage.

However, re-examining the fundamentals of the RISC-V ISA, I think OoO designs are simply the wrong way to go. The ISA is very scalable, and has laid down paths for scaling up to 256 bits (128 is already drafted). It's easier and more beneficial to scale up the ISA (along with the compiler for it) than to go OoO, since the branch predictor that requires is costly, bulky and the seed of all evil (speculative exploits). In support of this claim, just take a look at the A72 vs the A55: the A72 is almost 4x the size and only 78% faster.

Of course I don't mean we need 128-bit cores (just the ISA), as 64 bits is more than enough; putting more in-order 64-bit blocks together, tying them up with fast micro-switches (ASICs) and adding a minimal predictor (instruction-cycle-based decisions), would do much better in both performance per mm2 and performance per W. Besides, fast adoption of a wider ISA is also crucial for incorporating wider vector and SIMD extension blocks, which demand to be well fed. This is just my penny's worth on hardware architecture. Best regards.
"The XT 910's architecture is good for producing micrcontrollers"
Maybe I am old-fashioned, but a 64-bit superscalar processor with speculative and out-of-order execution and support for multiple cores does not seem remotely appropriate for a microcontroller. The power consumption and cost would be prohibitive, and the performance far more than a microcontroller requires.
If I understood correctly, it is a processor with small cores and a heavy reliance on multi-threading. Is out-of-order execution really needed in such a context? I guess that even without it, the amount of silicon sitting unused at runtime shouldn't be that big.