Linux's bad scalability
"Like the Altix UV which runs standard SUSE Enterprise Linux.
"and expand from its own variant of SUSE Linux to a machine that can run standard SUSE Linux Enterprise Server or Red Hat Enterprise Linux as well as Microsoft's Windows Server 2008 R2....SGI has been making a lot of noise lately about how Windows Server 2008 can run on its Altix UV 100 and 1000 machines, and in fact, the UltraViolet hardware scales far beyond the limits of that Windows OS at this point. The Windows kernel sees the"
Please, don't talk about Linux scalability. It might even scale worse than Windows. First of all, there are at least two different kinds of scalability:
1) Horizontal scalability, scale-out. This is basically a cluster: just add another node and the number crunching gets faster. These clusters typically have 10,000 cores or even more; supercomputers have many more. They are used for HPC number-crunching workloads and cannot handle SMP workloads.
2) Vertical scalability, scale-up. This is basically a single huge server. These SMP-like servers typically have 16 or 32 sockets; some even have 64. IBM mainframes go up to 64 sockets. They cost far more than clusters: for instance, the 32-socket IBM P595 used for the old TPC-C record had a list price of $35 million. Can you imagine what a 32-node cluster costs? Not $35 million. It costs roughly 32 × the price of one node, and if one node costs $5,000, that is about $160,000. These SMP-like servers are used for SMP workloads, typically running large databases in large configurations. HPC clusters cannot do this (they can run a clustered database, but not an ordinary single-image database).
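The difference between the two programming models can be sketched in a toy, single-process Python example (all names and the worker counts are my own illustration, not anyone's real system): a scale-up workload runs in one shared memory image and coordinates with locks, while a scale-out workload gives each "node" a private slice of data and only passes explicit messages between nodes.

```python
import threading

def _chunks(data, n):
    """Split data into n near-equal pieces; the last piece takes the remainder."""
    step = len(data) // n
    parts = [data[i * step:(i + 1) * step] for i in range(n - 1)]
    parts.append(data[(n - 1) * step:])
    return parts

# Scale-up (SMP) style: all workers run in one memory image and a lock
# coordinates their updates to the shared total.
def smp_sum(data, n_workers=4):
    total = [0]
    lock = threading.Lock()

    def worker(chunk):
        partial = sum(chunk)
        with lock:                      # shared state needs coordination
            total[0] += partial

    threads = [threading.Thread(target=worker, args=(c,))
               for c in _chunks(data, n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]

# Scale-out (cluster) style: each "node" sees only its own slice; the only
# thing that crosses node boundaries is an explicit message with the result.
def cluster_sum(data, n_nodes=4):
    messages = []                       # stands in for the interconnect
    for chunk in _chunks(data, n_nodes):
        messages.append(sum(chunk))     # node sends its partial sum "over the wire"
    return sum(messages)                # head node reduces the incoming messages
```

Both give the same answer for this embarrassingly parallel toy job; the point is that a database-style workload needs the shared-memory model on the left, which a cluster cannot provide.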
Enterprise companies are only interested in SMP workloads (large enterprise databases, etc.). The reason Unix rules in enterprise companies is that Unix has huge SMP-like servers capable of handling SMP workloads. Linux does not; large Linux SMP servers don't exist. Linux is only used on HPC clusters, and enterprise companies are not interested in HPC clusters.
Now, regarding Linux scalability: Linux runs excellently on clusters (such as supercomputers), but scales quite badly on SMP-like servers. Linux has severe problems utilizing more than 8 sockets. There has never been a 32-socket Linux server for sale. Only a couple of months ago did Bull release the Bullion, the first 16-socket Linux server ever in history. And it is dog slow.
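One way to see why scaling past 8 sockets is hard: by Amdahl's law, even a small serial fraction (kernel lock contention, say) caps the achievable speedup no matter how many CPUs you add. A quick sketch, with a 5% serial fraction picked purely for illustration:

```python
def amdahl_speedup(serial_fraction, n_cpus):
    """Maximum speedup on n_cpus when serial_fraction of the work cannot be parallelized."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_cpus)

# Illustrative only: with just 5% serial work, going from 8 to 512 CPUs
# buys far less than the 64x increase in hardware would suggest.
for cpus in (8, 64, 512):
    print(f"{cpus:4d} CPUs -> {amdahl_speedup(0.05, cpus):5.1f}x speedup")
```

With a 5% serial fraction the speedup can never exceed 20x, however many sockets are added, which is why shaving the last few percent of serialization out of a kernel matters so much for scale-up.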
There has never ever been a 32-socket Linux server for sale. Never. If you know of one, please show us a link; you won't find such a large SMP-like server. Sure, people have compiled Linux for the 32-socket IBM P795 AIX Unix server, but that is not a Linux server. It is a Unix server. I could port a C64 DOS to it, and that would not make the IBM Unix server a C64. People have also compiled SuSE for HP's 64-socket Itanium/Integrity Unix server, but it is still a Unix server. HP tried this before and never sold Linux on that Unix server; google "Big Tux Linux" for more information and see how bad the Linux scalability was, with ~40% CPU utilization running on 64 sockets. This means roughly every other CPU was idling under full load. How bad is that?
Regarding the SGI UV1000 servers: they are clusters with 10,000 cores. ScaleMP also has a huge Linux cluster with 10,000 cores. It runs a software hypervisor that tricks the Linux kernel into believing it is running on an SMP server, with bad scalability. The latency to far-away nodes makes the cluster incapable of handling SMP workloads. Latency on an SMP-like server is very good in comparison, making it possible to run a large database on all CPUs without grinding to a halt.
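The latency argument can be made concrete with a back-of-the-envelope model. The numbers below are my own round-figure assumptions (100 ns for local DRAM, 10 µs across a cluster interconnect), not measurements of any particular machine; the point is how quickly even a small remote fraction dominates the average access cost:

```python
def avg_access_cost(local_ns, remote_ns, remote_fraction):
    """Average memory access cost when a fraction of accesses must hit a remote node."""
    return (1 - remote_fraction) * local_ns + remote_fraction * remote_ns

# Illustrative numbers only: 100 ns local DRAM vs 10 us over the interconnect.
local, remote = 100, 10_000
for frac in (0.0, 0.01, 0.10, 0.50):
    cost = avg_access_cost(local, remote, frac)
    print(f"{frac:4.0%} remote -> {cost:6.0f} ns/access ({cost / local:.1f}x slowdown)")
```

Under these assumptions, just 1% remote accesses already doubles the average latency, and 10% makes it an order of magnitude worse, which is why a database touching memory all over a cluster grinds to a halt while the same database runs fine on a real SMP box.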
Thus, Linux servers with 10,000 cores (that is, clusters) are not suitable for handling enterprise SMP workloads. See for yourself: the ScaleMP Linux cluster is only used for HPC number crunching:
"...Since its founding in 2003, ScaleMP has tried a different approach. Instead of using special ASICs and interconnection protocols to lash together multiple server modes together into a SMP shared memory system, ScaleMP cooked up a special software hypervisor layer, called vSMP, that rides atop the x64 processors, memory controllers, and I/O controllers in multiple server nodes....vSMP takes multiple physical servers and – using InfiniBand as a backplane interconnect – makes them look like a giant virtual SMP server with a shared memory space. vSMP has its limits.
The vSMP hypervisor that glues systems together is not for every workload, but on workloads where there is a lot of message passing between server nodes – financial modeling, supercomputing, data analytics, and similar parallel workloads. Shai Fultheim, the company's founder and chief executive officer, says ScaleMP has over 300 customers now. "We focused on HPC as the low-hanging fruit..."
SGI talks about its large Linux clusters with thousands of cores:
"...The success of Altix systems in the high performance computing market are a very positive sign for both Linux and Itanium. Clearly, the popularity of large processor count Altix systems dispels any notions of whether Linux is a scalable OS for scientific applications. Linux is quite popular for HPC and will continue to remain so in the future,
However, scientific applications (HPC) have very different operating characteristics from commercial applications (SMP). Typically, much of the work in scientific code is done inside loops, whereas commercial applications, such as database or ERP software are far more branch intensive. This makes the memory hierarchy more important, particularly the latency to main memory. Whether Linux can scale well with a SMP workload is an open question. However, there is no doubt that with each passing month, the scalability in such environments will improve. Unfortunately, SGI has no plans to move into this SMP market, at this point in time..."
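The distinction the quote draws, loop-heavy scientific code versus branchy, latency-sensitive commercial code, can be sketched with a toy Python access-pattern comparison. This is only a sketch: pure Python hides most hardware cache effects, but the two access patterns below are exactly what makes the memory hierarchy matter so much more for the second kind of workload on real hardware.

```python
import random

N = 100_000

# HPC-style: a tight loop over contiguous data with a predictable,
# prefetcher-friendly access pattern.
array = list(range(N))

def sequential_sum(a):
    total = 0
    for x in a:
        total += x
    return total

# Commercial-style: pointer chasing, modeled as "next index" hops through a
# random permutation. Each access depends on the previous one, so the access
# pattern is unpredictable and, on real hardware, defeats caches and prefetch.
perm = list(range(N))
random.shuffle(perm)
next_idx = [0] * N
for i in range(N - 1):
    next_idx[perm[i]] = perm[i + 1]
next_idx[perm[-1]] = perm[0]        # close the cycle

def pointer_chase(nxt, start, steps):
    i = start
    for _ in range(steps):
        i = nxt[i]                  # every step waits on the previous load
    return i
```

Both loops do O(N) work, but on a machine with slow main memory the pointer chase pays full memory latency on nearly every step, which is exactly why latency to main memory decides database performance while throughput decides HPC performance.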
Ergo, you see that Linux servers with thousands of cores are only used for HPC number crunching and cannot handle SMP-like workloads. SGI and ScaleMP say so themselves.
And also, there has never been a 32-socket SMP-like Linux server for sale. Until a couple of months ago there was no 16-socket Linux server for sale either, when Bull released the Bullion, the first-generation 16-socket SMP-like Linux server. Ever. And it performs very badly; just read the benchmarks.
In comparison, Unix on 16, 32, or 64 sockets has performed very well for decades. Linux scales well on clusters, but extremely badly on huge SMP-like servers with up to 64 sockets. The thing is, Linux developers never had access to large SMP servers, so they cannot tailor Linux to such workloads; Unix developers have been able to for decades. Some decades from now, maybe Linux will be able to handle 32 sockets well, too. But not today. Just read SGI and ScaleMP: all their systems are used for HPC and avoid SMP workloads. Why?