You certainly didn't have to write Occam to use a Meiko machine. We had C and Fortran compilers for the Transputers... (I know for sure, as I wrote the code generator for the Fortran compiler [in BCPL, which we also had, of course]).
Posts by UK Jim
11 posts • joined 8 Mar 2012
Clustered Pi Picos made to run original Transputer code
For every disastrous rebrand, there is an IT person trying to steer away from the precipice
Uncle Sam passes comms act that sets aside $750m for the development of OpenRAN
Nvidia unveils $59 Nvidia Jetson Nano 2GB mini AI board, machine learning that slashes vid-chat data by 90%, and new super for Britain
Re: Wrong!
TBH I don't care how much AI power it has. There is a well defined measure of High Performance Computing performance (which we hate, but there it is), and if you claim to be "Britain's most powerful publicly known supercomputer", and then quote a position on the Top500 and a Linpack performance number then you are explicitly using that definition. (Which, here, does not support the claim of being the UK's top machine).
If you want to compare "AI power", then that's fine, you're very welcome to do that, but whatever you claim there is not comparable with any measure used to evaluate and rate the top supercomputers.
If there were an "AI500" list and a recognised AI benchmark to use to rate machines, then claiming a position in that would be fine. But the claim that performance on FP16 (or BFP16) is comparable with double-precision Linpack is just wrong.
Or, if you prefer: you are absolutely right, NVIDIA is making an apples to tangerines comparison, and that is what I am objecting to!

Wrong!
"The next SuperPOD project is the Cambridge-1 behemoth, planned to be Britain's most powerful publicly known supercomputer"
Umm, no:
"ARCHER2 will have a peak performance estimated at 28 x 10**15 FLOP/s" (https://epsrc.ukri.org/blog/supercomputers-how-archer2-will-increase-the-pace-and-productivity-of-research/#:~:text=ARCHER2%20will%20have%20a%20peak,than%20the%201964%20Cray%20supercomputer.)
vs "eight petaflops of Linpack benchmark performance" (your article).
So Archer 2 is 3.5x more powerful on the disliked, but standard, measure of HPC machine performance. (And Archer 2 is "publicly known".)
Verity Stob is 'Disgusted of HG Wells': Time, gentlemen, please
Swiss super pushes USA off podium in new Top500 Supers list
Titan History
"and in the process bump the “Titan” machine at the USA's Oak Ridge National Laboratory off the podium for just the third time in the history of the TOP500 list of the world's mightiest supercomputers."
Really!? This is the third time that Titan has dropped out of the top 3?
I think what you were trying to say is "and in the process made this only the third time in the history of the list when the USA has no machine in the top 3".
Little ARMs pump 2,048-bit muscles in training for Fujitsu's Post-K exascale mega-brain
Auto vectorization...
While you can force vectorization of that code (as you do by using compiler flags), doing so is not, in general, safe. Consider an invocation of vectorize_this in which a and either b or c point into the same array. (E.g. vectorize_this(&b[1], b, c);). There is now a loop-carried dependence, and the results generated by the vector code will be different from those generated by the scalar code.
If you know that the code is used without such overlaps, then the right answer is to modify the code and use the "restrict" qualifier on the arguments to inform the compiler of that fact. Though then, of course, you can't claim not to have to modify the code!
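To make the overlap problem concrete, here is a minimal sketch in C. The article's vectorize_this is not reproduced here, so its body (an element-wise add), the extra length parameter n, and the helper names vectorize_this_as_vector, vectorize_this_restrict and overlap_changes_result are all my assumptions for illustration. The "vector" variant only models what forced vectorization does (load the whole block of inputs, then store); it is not real SIMD code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define N 8

/* Assumed shape of the loop under discussion: plain scalar code.
   With no restrict, the compiler must allow a to overlap b or c. */
void vectorize_this(unsigned *a, const unsigned *b, const unsigned *c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + c[i];
}

/* Model of a forced vectorization (hypothetical helper): all inputs
   for the block are loaded before any result is stored. */
void vectorize_this_as_vector(unsigned *a, const unsigned *b, const unsigned *c, size_t n)
{
    unsigned tmp_b[N], tmp_c[N];
    assert(n <= N);
    memcpy(tmp_b, b, n * sizeof *b);
    memcpy(tmp_c, c, n * sizeof *c);
    for (size_t i = 0; i < n; i++)
        a[i] = tmp_b[i] + tmp_c[i];
}

/* The suggested fix: restrict promises the compiler that the arrays
   do not overlap. Calling this with overlapping arguments is
   undefined behaviour, so the promise has to be true. */
void vectorize_this_restrict(unsigned *restrict a, const unsigned *restrict b,
                             const unsigned *restrict c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + c[i];
}

/* Returns 1 if the scalar loop and the vector model disagree for the
   overlapping call vectorize_this(&b[1], b, c). */
int overlap_changes_result(void)
{
    unsigned s[N + 1], v[N + 1], c[N];
    for (size_t i = 0; i <= N; i++)
        s[i] = v[i] = 1;
    for (size_t i = 0; i < N; i++)
        c[i] = 1;

    vectorize_this(&s[1], s, c, N);           /* each store feeds the next load */
    vectorize_this_as_vector(&v[1], v, c, N); /* loads never see the stores */

    return memcmp(s, v, sizeof s) != 0;
}
```

In the scalar run each iteration reads the value the previous iteration just wrote, so the results accumulate along the array; in the vector model every element is computed from the original contents. That difference is exactly the loop-carried dependence, and it is why the restrict version must only ever be called with non-overlapping arrays.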
(FWIW I work for Intel, and this is an issue for everyone...)
Picking apart the circuits in the ARM1 – the ancestor of your smartphone's brain
Company you never heard of builds 3.4 petaflops super for DOE
Re: since when
Intel acquired the Infiniband assets of Qlogic about a year ago...
http://newsroom.intel.com/community/intel_newsroom/blog/2012/01/23/intel-takes-key-step-in-accelerating-high-performance-computing-with-infiniband-acquisition
(FWIW I work for Intel, but I do not speak for them :-))