* Posts by Kebabbert

808 publicly visible posts • joined 22 Jul 2009


My other supercomputer is a Lenovo: What IBM System x sale means for HPC


IBM Blue Gene powerful!

Yeah, the first Blue Gene HPC machines were mighty powerful and fast, ranking number 7 on the Top500. And what CPUs did they sport? Well, dual-core PowerPCs at 750MHz. Not the fast POWER CPUs, but PowerPC - the same family used for embedded systems and other low-power configurations. I don't know if Blue Gene has been upgraded to 1.2GHz CPUs today?

My point is, in the HPC arena, what matters is performance/watt. You need low-wattage hardware. And the main problem is distributing the workload across the CPUs, so it does not really matter if the individual CPUs are weaker. You don't need the fastest CPU; you need the fastest interconnect and low-power CPUs. I have heard that the power bill for an HPC supercomputer can be 10 million USD annually.

The fact that a CPU is used in the Top500 list only says it has good performance/watt. It says nothing about how powerful the CPU is.

Hazelcast signs Java speed king to its in-memory data-grid crew


"Are you sure you have got that right ? nano-second access across hundreds of GBs of memory isn't doable in hardware, let alone through a JVM sat on the end of a network connection."

Why isn't it doable? Can you explain a bit more?

Torvalds: Linux devs may 'cry into our lonely beers' at Christmas


Linux bad scalability

"Like the Altix UV which runs standard SUSE Enterprise Linux.

Ref :http://www.theregister.co.uk/2011/06/15/sgi_altix_sales_hadoop_prefabs/

"and expand from its own variant of SUSE Linux to a machine that can run standard SUSE Linux Enterprise Server or Red Hat Enterprise Linux as well as Microsoft's Windows Server 2008 R2....SGI has been making a lot of noise lately about how Windows Server 2008 can run on its Altix UV 100 and 1000 machines, and in fact, the UltraViolet hardware scales far beyond the limits of that Windows OS at this point. The Windows kernel sees the"

Please, don't talk about Linux scalability. It might even scale worse than Windows. First of all, there are at least two different kinds of scalability:

1) Horizontal scalability, scale-out. This is basically a cluster. Just add another node and the number crunching gets faster. These clusters typically have 10,000 cores or even more; supercomputers have many more. They are used for HPC number-crunching workloads and can not handle SMP workloads.

2) Vertical scalability, scale-up. This is basically a single huge, fat server. These huge, SMP-like servers typically have 16 or 32 sockets; some even have 64. IBM mainframes have up to 64 sockets. They cost much more than clusters. For instance, the 32-socket IBM P595 used for the old TPC-C record cost $35 million list price. Can you imagine what a 32-socket cluster costs? Not $35 million. It will cost roughly 32 times the price of one node, and if one node costs $5,000, that is nowhere near $35 million. These SMP-like servers are used for SMP workloads, typically running large databases in large configurations. HPC clusters can not do this (they can run a clustered database, but not an ordinary database).

Enterprise companies are only interested in SMP workloads (large enterprise databases, etc). The reason Unix rules in enterprise companies is that Unix has huge SMP-like servers capable of handling SMP workloads. Linux does not; large Linux SMP servers don't exist. Linux is only used on HPC clusters, and enterprise companies are not interested in HPC clusters.

Now regarding Linux scalability: Linux runs excellently on clusters (such as supercomputers), but scales quite badly on SMP-like servers. Linux has severe problems utilizing anything beyond 8 sockets. First of all, there has never been a 32-socket Linux server for sale. Only recently, a couple of months ago, did Bull release the Bullion, the first 16-socket Linux server - the first ever in history. And it is dog slow.

There has never been a 32-socket Linux server for sale. Never. If you know of one, please show us a link. You won't find any such large SMP-like server. Sure, people have compiled Linux for the 32-socket IBM P795 AIX Unix server - but that does not make it a Linux server. It is a Unix server. I could port C64 DOS to it, and that would not make the IBM Unix server a C64. And people have compiled SuSE for HP's 64-socket Unix Itanium/Integrity server - but it is still a Unix server. HP tried this before and never really sold Linux on its Unix servers; google "Big Tux Linux" for more information and see how bad the Linux scalability was: roughly 40% CPU utilization across 64 sockets. That means every other CPU was idling under full load. How bad is that?

Regarding the SGI UV 1000 servers, they are clusters with thousands of cores. ScaleMP also has a huge Linux cluster with thousands of cores. It is a cluster running a software hypervisor that tricks the Linux kernel into believing it is running on an SMP server - with bad scalability. Latency to far-away nodes makes the cluster incapable of handling SMP workloads. Latency on an SMP-like server is very good in comparison, which makes it possible to run a large database across all CPUs without grinding to a halt.

Thus, Linux servers with thousands of cores (that is, clusters) are not suitable for handling enterprise SMP workloads. See for yourself. The ScaleMP Linux cluster is only used for HPC number crunching:


"...Since its founding in 2003, ScaleMP has tried a different approach. Instead of using special ASICs and interconnection protocols to lash together multiple server modes together into a SMP shared memory system, ScaleMP cooked up a special software hypervisor layer, called vSMP, that rides atop the x64 processors, memory controllers, and I/O controllers in multiple server nodes....vSMP takes multiple physical servers and – using InfiniBand as a backplane interconnect – makes them look like a giant virtual SMP server with a shared memory space. vSMP has its limits.


The vSMP hypervisor that glues systems together is not for every workload, but on workloads where there is a lot of message passing between server nodes – financial modeling, supercomputing, data analytics, and similar parallel workloads. Shai Fultheim, the company's founder and chief executive officer, says ScaleMP has over 300 customers now. "We focused on HPC as the low-hanging fruit..."

SGI talks about their large Linux clusters with thousands of cores:


"...The success of Altix systems in the high performance computing market are a very positive sign for both Linux and Itanium. Clearly, the popularity of large processor count Altix systems dispels any notions of whether Linux is a scalable OS for scientific applications. Linux is quite popular for HPC and will continue to remain so in the future,


However, scientific applications (HPC) have very different operating characteristics from commercial applications (SMP). Typically, much of the work in scientific code is done inside loops, whereas commercial applications, such as database or ERP software are far more branch intensive. This makes the memory hierarchy more important, particularly the latency to main memory. Whether Linux can scale well with a SMP workload is an open question. However, there is no doubt that with each passing month, the scalability in such environments will improve. Unfortunately, SGI has no plans to move into this SMP market, at this point in time..."

Ergo, you see that Linux servers with thousands of cores are only used for HPC number crunching and can not handle SMP-like workloads. SGI and ScaleMP say so themselves.

And again, there has never been a 32-socket SMP-like Linux server for sale. Until a couple of months ago there was no 16-socket Linux server for sale either, until Bull released the Bullion, the first-generation 16-socket SMP-like Linux server. Ever. And it performs very badly - just read the benchmarks.

In comparison, Unix on 16, 32 or 64 sockets has performed very well for decades. Linux scales well on clusters, but extremely badly on the kind of huge SMP-like servers Unix runs with up to 64 sockets. The thing is, Linux developers have never had access to large SMP servers, so they can not tailor Linux to such workloads. Unix developers have been able to do this for decades. Some decades from now, maybe Linux will be able to handle 32 sockets well, too. But not today. Just read SGI and ScaleMP - both are used for HPC and avoid SMP workloads. Why?

IBM flashy January announcement: Wanna know what's in it?


Re: A long time in the planning?

The reason IBM sold the hard disk business is this:

Back then, hard disks increased storage capacity at a very high pace. Extrapolating that development, IBM drew the conclusion that in a few years there would be 500GB disks that could store everything you could imagine - a single disk would suffice to hold everything a small team could produce, so people would not need to buy as many drives anymore. And this is true today: all the documents, source code, etc that a small team can produce will easily fit on a 2TB disk. Even a larger team's output - maybe even the entire Linux source code with every revision back to the very beginning - would fit on a 2TB disk. So why would IBM keep manufacturing hard disks that would soon be too large and too good? Better to sell the hard disk division while IBM could still get a high price for it. That is the reason.

IBM management reasoned incorrectly, though. It is like reasoning: future CPUs will be much faster than today's, so one CPU will suffice for many people - better sell off the CPU division now.

The fallacy is this: the more CPU power people get, the more demanding applications they run. In the same vein, the more storage people get, the more data they keep. No matter how much CPU power or storage you get, you will always find a use for it.

"Intel giveth, Microsoft taketh."

Sega’s Out Run: Even better than the wheel thing


Play it again here:

There is the MAME emulator, which emulates arcade machines; MAME is free and open source. People have dumped arcade ROMs to files and put them on the net. If you look around a bit, you will find the Out Run ROM, so you can play it at home again. And yes, the ROM is identical to the arcade machine's, which means the game itself is identical to what ran in the arcade - graphics, music, everything. So, please enjoy Out Run again! :)

PS. People have dumped all sorts of ROMs, so you can also find Space Harrier, Donkey Kong, etc - all these ROMs are copied directly from the arcade machines, so your game experience with MAME will be the same.

Sony's new PlayStation 4 and open source FreeBSD: The TRUTH


Re: And this is news? @Kebabbert

"...As for the conspirationist theories..."

It is a fact that these large companies all simultaneously bet on immature Linux with a worse license, instead of mature FreeBSD with a more suitable license. It is a fact that large companies have together bet on one company or technology instead of something superior. There are no theories involved here.

It is also a fact that only a few companies control the global economy - there is much research, including PhD dissertations, on this. These companies are typically Wall Street investment banks, such as Goldman Sachs, JP Morgan, etc, and they all cooperate tightly - no theories involved here either; there is a lot of credible research on this, just read the research papers in my link. If these companies decide to bet on something, for instance shorting more silver than is produced annually, then silver prices will plummet (this has happened, and if you work in finance you know it). Or if they decide to go long on a company, its stock price will rise. These are facts.

Here comes the only theory in my post, again: "It is my theory that if these companies bet heavily on Linux, then Linux will take off. So it might happen that Oracle and IBM and MS and HP and everyone else will bet heavily on Linux and start to buy and sell Linux." This is the only theory in my post, and I agree it might be considered a "conspiracy theory". The rest of the contents of my post are facts. Just read the research papers if you think it is a bit far-fetched. I will recap here the conclusion one PhD dissertation in my link arrived at:

- The researcher used mathematical models to analyze a lot of financial databases containing a lot of information. In particular, he analyzed which company owned a stake in another company, which company owned another company that owned another company, and so on, keeping track of all this across millions of companies. It turned out that, consistently, only a small group of companies sat like the spider in the web, controlling every other company - and these roughly 50 companies were typically Wall Street investment banks: Goldman Sachs, Barclays, JP Morgan, etc. Just read the new and groundbreaking research in my link. It is very interesting when researchers apply mathematical models in other areas, such as economics. No human can do this, but math and computers can, and we can observe things that were not observable earlier.


Re: And this is news?

"...Why did IBM, Oracle, Google, Facebook and others went with Linux..."

Yeah, why? Back then, they all suddenly jumped ship and bet heavily on Linux, which was very immature at the time. Linux supported 2-4 cores and not much more. FreeBSD was already mature and stable. And still every large company jumped ship and went with Linux, which has a much more constrained license than FreeBSD. The FreeBSD license allows anyone to use all the code and even close the source, while GPL Linux forces you to keep the code open; you can not close it. Obviously, a competitive company (all of them are) that prefers proprietary stuff would prefer the FreeBSD license - and still everyone suddenly chose Linux, which was immature and had a bad license for monopolistic, greedy companies. They all chose Linux at the same time. Why?

I worked at a big Fortune 500 company recently and they suddenly said "orders from management: we don't buy HP anymore" - and several other companies reported the same thing: they stopped doing business with HP. At the same time.

And I also saw that several big companies recently chose to bet heavily on ARM. Microsoft released Windows for ARM CPUs - that was unheard of! MS had support for non-Intel architectures way back, but today? Why would MS suddenly bet on ARM and release Windows for it? AMD did the same; AMD is now designing ARM CPUs. And Nvidia too. And HP will sell ARM servers. And several other companies. Suddenly they all simultaneously chose to bet heavily on ARM. Or Linux. Or Google. Or Microsoft Windows instead of OS/2. Etc. Why do they act as if one single will governs them?

You want to know the answer? Here it is - read the brand new research on this subject, a PhD thesis and other senior researchers:


Fed up with Windows? Linux too easy? Get weird, go ALTERNATIVE



"....> Open VMS lives on in the architecture of Windows NT

Except for a whole bunch of differences that mattered like putting graphics drivers in kernel space which has always been one of NT's Achilles heel (but also necessary for the gamers and partially alleviated with modern WDDM). I am not sure of the numbers but I think people would be amazed how many blue screens are due to poorly written 3rd party drivers (and hardware failures themselves of course) as opposed to poorly written Microsoft code (some there too though especially in past)...."

You are not really up to date. Windows has moved graphics out of the kernel, so the latest incarnations of Windows are a lot more stable than when the graphics lived in the kernel. For instance, Windows 7 can nowadays update its graphics driver without rebooting.

Funnily enough, Linux has moved its graphics into the kernel. This has made Linux even less stable: Linux increased its performance at the cost of stability. How good is an OS if it is fast but unstable?

Havana see your bare metal: New OpenStack v8 clinches containerization


Solaris containers

are far older than 2005. They have been reworked several times; the last iteration was released in 2005, but they stem back to 1999.

Anyway, this Linux tech seems heavily inspired by Solaris Containers. The cool thing about Solaris Containers back then was that they remapped all kernel calls to the Solaris kernel - this was new. So when you installed Linux Red Hat in a container, all its kernel calls were remapped to the Solaris kernel, and only one Solaris kernel was active no matter how many containers you booted. Each container only cloned a few kernel data structures in RAM (about 40MB) and cloned the filesystem (about 100MB) via ZFS. So containers are really resource-saving; that is the point of Solaris Containers. One guy booted 1,000 containers on a 1GB PC back then - it worked, but it was really slow. Now, this Linux tech seems awfully similar, just like systemd is a clone of Solaris SMF, Linux Btrfs is a clone of Solaris ZFS, Linux SystemTap is a clone of Solaris DTrace, Linux Open vSwitch is a clone of Solaris Crossbow, etc etc etc.

It would be really cool if Linux developers did something new themselves, instead of cloning what others have done. For instance, the Linux "RCU", which is presumably cool according to Linux developers, turns out to be tech patented by IBM. So RCU is not an invention of the Linux devs either. BTW, did you know that the Linux kernel itself is a clone of Unix? Everything in Linux is a clone; nothing is new.

Oracle's Ellison talks up 'ungodly speeds' of in-memory database. SAP: *Cough* Hana


Re: Big memory


Ok, you mean that if Bixby supports 96 sockets, then you can not use ordinary M6-32 servers; you must use modified M6-32 servers? Maybe you need to insert another card into each M6-32 server? Is that it?

Are you implying that for Bixby to connect 96 sockets, you can not use three of the normal M6-32 servers, but need another type of server that has not been announced yet? I doubt that, because these large servers are expensive to design. It would be more economical to let three M6-32 servers connect via some extra hardware, using Bixby. But you don't agree with my guess? You mean another type of server is coming? Do you have information on this, or is it your guess?


Re: Big memory


Ok, I did not know that. How do you know? Do you have more information? I mean, Bixby builds a huge 96-socket server from building blocks, and the building blocks are M6-32 servers. So I thought you could use several M6-32 servers to build an M6-96? But this is wrong? What link have you read to learn more?


Big memory

Three of the new 32-socket Oracle M6 servers will be able to connect via the Bixby interconnect into a huge 96-socket M6 server with 96TB of RAM and 9,216 threads. If you run your database from 96TB of RAM and also compress the data, it will be very fast. I doubt SAP Hana can compete with such a huge server. How much RAM can Hana utilize? Can Hana go higher than 96TB of RAM? I doubt it. Anyone know?

Red Hat and dotCloud team up on time-saving Linux container tech


Clone of Solaris Containers

"....This approach beats VMs in terms of resource utilization, as the OS copy is shared across all apps running on it, whereas virtual machines come with the abstraction of separating each OS onto each VM, which adds baggage...."

Solaris Containers have done this for ages. Linux is cloning Solaris tech again, just as it cloned ZFS, DTrace, SMF, Crossbow, etc.

One difference from Solaris Containers is that Solaris allows virtualization of different kernel versions; you can even install Linux in a Solaris container. There is only one Solaris kernel running, and all the virtualized kernels just remap API calls to that single Solaris kernel. One Sun guy started 1,000 containers on a 1GB Solaris PC; it was very slow, but it worked. Each Solaris container uses something like 40MB of RAM and 100MB of disk space, by cloning some kernel data structures. They are very efficient.

And now Linux is getting them too.

So, Linus Torvalds: Did US spooks demand a backdoor in Linux? 'Yes'


Re: Other ways to get a back door

"...Strange then that [all code] does get checked..."

Sure it gets checked. But the point is that it is not checked thoroughly. It is only skimmed, and a lot of subtleties are not caught. There are question marks in the code that gets accepted, because the code turnover is so high that no one can thoroughly check all of it. A lot of code that no one really understands gets accepted. Maybe some of it contains subtle back doors?


"....Lok Technologies , a San Jose, Calif.-based maker of networking gear, started out using Linux in its equipment but switched to OpenBSD four years ago after company founder Simon Lok, who holds a doctorate in computer science, took a close look at the Linux source code.

“You know what I found? Right in the kernel, in the heart of the operating system, I found a developer’s comment that said, ‘Does this belong here?’ “Lok says. “What kind of confidence does that inspire? Right then I knew it was time to switch....”


Other ways to get a back door

The NSA doesn't have to ask Linus Torvalds himself anymore. They can just submit a patch: there are so many patches going into Linux all the time that it is hard to check all the new code. Apparently, this attempt was blocked - but how many more are not? In Windows, the NSA can not submit a patch, so it must ask Microsoft to deliberately insert one. But with Linux's very high code turnover, it is not hard to slip in some new code:


"If you were the NSA, how would you backdoor someone's software? You'd put in the changes subtly. Very subtly."

"Whoever did this knew what they were doing," says Larry McVoy, founder of San Francisco-based BitMover, which hosts the Linux kernel development site that was compromised. "They had to find some flags that could be passed to the system without causing an error, and yet are not normally passed together... There isn't any way that somebody could casually come in, not know about Unix, not know the Linux kernel code, and make this change. Not a chance."

KVM kings unveil 'cloud operating system'


Sounds like SmartOS

The Joyent people have released their SmartOS, which is an OpenSolaris derivative. It is used successfully as a cloud OS to deploy virtual machines with Solaris Zones and KVM.

China's corruption crackdown killing off Unix


Re: Switching from big iron to x86 virtualisation

"...Another point to consider with respect to your assertion that you will get good scaling from unmodified binaries on an M9000 is that it relies on running a huge number of threads to achieve high system throughput at the expense of a big hit in single-thread throughput. Legacy binaries are usually tuned to run on high-single-thread throughput systems, so I would expect to see better scaling and throughput from a system with a high-throughput cores (eg: Xeon) vs low-throughput cores (SPARC Tx)...."

Jesus. You are just totally off. You haven't understood much. It is the opposite: the M9000 is old and uses the old SPARC64 CPU, which has two strong threads per core and low throughput. The new SPARC Tx has very high-throughput cores with many threads. Xeon has comparatively low throughput and SPARC Tx (T5, etc) has high throughput: Xeon has few strong cores and few threads, while SPARC T5 has many threads to boost throughput. This is just the opposite of what you believe.

Relatively speaking, if you compare them, this holds:

Xeon: lower throughput, because it has few strong threads

SPARC64: lower throughput, because it has few strong threads

SPARC Tx (T5): high throughput, because it has many weaker threads.

You are stating the opposite. Just check some of the world-record benchmarks here, and see that SPARC T5 is SEVERAL times faster at high-throughput server workloads (including SPECint2006):



"....Probably because people like Oracle have their customers by the short and curlies. $legacy_vendor gets more margin if they can convince you that buying one of their new boxes will cost less than a new $legacy_vendor license + $competitor's box. The $legacy_vendor can set the license for the competition's box to cancel out the price difference AND they can maintain a nice fat margin because there is no competition (that's the whole point of legacy lock-in)...."

Wrong again. The Oracle database runs on Linux too - in fact, I have heard it is nowadays mainly developed on Linux. So it would be very easy to migrate from an Oracle database running on a very, very expensive 32-socket server to a cheap 32/64/128-socket Linux SGI cluster running the Oracle database. But no one is doing that. Why? And Oracle is not famous for cutting prices when they have you locked in; they are expensive. Many companies want to migrate to other databases because of the high license costs. If you did not know this, I doubt you have worked at Wall Street as you claim.

Why does no one migrate from a very expensive 32-socket Unix server to a cheap SGI cluster running the same software? No vendor lock-in exists, because you migrate from Oracle to Oracle. Your explanations for why no one does this are logically unsound. I will tell you the answer: as the kernel developer explained, these cheap Linux clusters can not run large Oracle database configurations, which require SMP servers, and that is why no one migrates from expensive Unix SMP servers to cheap Linux clusters. Clusters can not handle huge database configurations: the worst-case RAM latency in a cluster is more than 10,000ns, and the performance of a database would grind to a halt.


"...I doubt [SGI cluster] is cheap...."

Wrong. Yes, the SGI cluster is cheap. Check the prices. It can in no way compare to the 32-socket IBM P595 server used for the TPC-C record, which cost $35 million list price. I am convinced the SGI cluster costs about as much as a bunch of x86 CPUs and a fast switch, or maybe twice that. You can buy several SGI clusters for the price of one 32-socket IBM server.


"...[SGI UV1000] it's not a cluster, it runs a *single* instance of the OS against shared memory. A single process can use every single byte of memory in that system. The same is not true of a cluster...."

You are wrong again. For instance, the ScaleMP Linux server with thousands of cores shows the same characteristics: it runs 8,192 cores and loads of RAM under a single Linux kernel image. But it is actually a cluster. Just because it runs a single image does not mean it is not a cluster. It can only run HPC workloads, just like the SGI cluster. Both clusters consist of several smaller nodes, connected to look like one giant server running a single kernel image:


"... vSMP takes multiple physical servers and – using InfiniBand as a backplane interconnect – makes them look like a giant virtual SMP server with a shared memory space....The vSMP hypervisor that glues systems together is not for every workload, but on workloads where there is a lot of message passing between server nodes – financial modeling, supercomputing, data analytics, and similar parallel workloads. Shai Fultheim, the company's founder and chief executive officer, says ScaleMP has over 300 customers now. "We focused on HPC as the low-hanging fruit".... "

Check the SGI and ScaleMP workloads: they are all HPC workloads. For a reason. No customer runs a large database configuration such as Oracle.


"...With SMP all memory is remote, so you are operating at worst case (but uniform) latency all the time, by contrast with NUMA you get the best possible latency for local accesses and the same worst possible latency for remote accesses as you would have with an SMP box...."

True. The Oracle servers are SMP-like and have a very low worst-case latency, 500ns or so; in effect you can treat them like true SMP servers. The SGI and ScaleMP clusters have latencies of 10,000ns or much higher - these must be treated as clusters and can only run cluster software. That is why all SGI and ScaleMP customers run HPC workloads.


"...All the M9000 does is hide the latency from your code by putting your code to sleep and running another thread whenever it has to hit main memory. Xeons achieve a similar trick with HyperThreading..."

Wrong again. You are mixing up the M9000 CPUs with the SPARC Niagara CPUs. The Niagara CPUs hide latency by switching to another thread when the pipeline stalls on a cache miss, and they have many cores and many threads to achieve very high throughput. The M9000 has the old SPARC64 CPU, with only 1-2 threads. The old SPARC64 is very similar to older x86 or older IBM POWER CPUs: few cores, 1-2 strong threads. All CPUs were constructed like this long ago. Then came the SPARC Niagara and changed everything with many cores and many threads, and now every CPU is similar to Niagara: POWER7, Xeon. The SPARC64 is not good at high throughput; it has a few strong threads, not many threads.

So you don't know much about the M9000 or SPARC64 CPUs. You are mixing them up. That might explain why you are so off with your knowledge.


"....I'm done. I can't see much point in writing stuff for someone who shows no evidence of being able to read and learn...."

I have shown that you are wrong in many (every?) part of your reasoning. It is you who needs to read and catch up.


Re: Switching from big iron to x86 virtualisation

Jesus. Again, the SGI UV 1000 is a cluster.


"One can view NUMA as a tightly coupled form of cluster computing."

And, again, there are fundamental differences between a cluster and an SMP server. As a kernel developer explains:


"[HPC clusters] spend a huge majority of their time in "user" and only a minuscule tiny amount of time in "sys". I'd expect to find very, very few calls to inter-thread synchronization (like mutex locking) in such applications....Consider a massive non-clustered database. (Note that these days many databases are designed for clustered operation.) In this situation, there will be some kind of central coordinator for locking and table access, and such, plus a vast number of I/O operations to storage, and a vast number of hits against common memory. These kinds of systems spend a lot more time doing work in the operating system kernel. This situation is going to exercise the kernel a lot more fully, and give a much truer picture of "kernel scalability"

If you really believe such a 256-socket SGI UV 1000 cluster is going to run an Oracle database for a tenth of the price of a 32-socket SMP server at higher performance, why don't all the large investment banks on Wall Street do that? Are you thinking at all? Why is there a market for very, very expensive 32-socket servers if a cheap 256-socket SGI cluster could replace them? Seriously?

A question: did you never attend any logic courses? Never go to university?


Re: Switching from big iron to x86 virtualisation @ Kebebbert


"...Given the >3x performance advantage of the Xeon cores over the M9000 cores in SPEC rate figures I think that very few people would choose the M9000 for the kind of compute intensive workloads because it would need 3x as many sockets to achieve the same result with perfect linear scaling (ie: not gonna happen)...."

Wrong conclusion. If a Xeon core is 3x faster, it does not follow that the M9000 needs 3x more sockets, because there is a huge difference between a core and a CPU. If the M9000 CPUs had very many cores, say thousands, you would not need 3x more _sockets_. Or, if the M9000 CPU had only one core and the Xeon had 1,000 cores, the M9000 would need far more than 3x the sockets to catch up with one Xeon.

But it is true that Xeon is faster at number crunching. The reason is that SPARC is designed for enterprise use. That means SPARC has better RAS (if a CPU instruction was erroneous, the CPU can roll back and replay the instruction, just like IBM mainframes; Xeon can not do this), and SPARC is better at SMP workloads because it scales better. You can run large databases on an M9000, but not on Xeon servers.


"....I can't help but notice that these benchmarks, that you claim show Linux does not scale well, are running on HP hardware, meanwhile all the examples of great scalability seem to come from the SPARC/Solaris hardware..."

HP gets good scalability on the same 64-socket hardware when running its own Unix, HP-UX. They offer 64-cpu configurations when running HP-UX on the Big Tux server. But when running Linux on the same Big Tux server, the largest supported Linux configuration is 16 cpus. No 64-cpu Linux configurations are offered. Why is that? Is it because Linux has problems utilizing more than 16 cpus? Would larger configurations create too many support problems? So, where are those superior Linux 32-socket benchmarks?


"....Let me know if you can find some benchmarks that test scalability for Solaris & Linux running on identical SPARC 16/32/64 socket hardware...."

I know that Linux runs on small 1-2 socket SPARC workstations, but it would be far-fetched to believe you could move the same Linux to a 32-socket SPARC server without problems. That would need a lot of tailoring, recompiling and redesigning of the Linux kernel. So don't expect to see Linux running well on larger SPARC servers. SPARC servers have traditionally targeted scalability, not number crunching; many SPARC benchmarks are about scalability, because that is their strength.

However, there are Solaris and Linux benchmarks on nearly identical x86 hardware. With few sockets (1-2), sometimes Linux wins and sometimes Solaris does. But when you go up to 8 sockets (which is still a very small server), Solaris wins easily. The more sockets, the better Solaris scales. On 1-2 socket servers Linux wins most of the time, I suspect, and on small handheld devices too. But on larger servers, 8 sockets and upwards, Solaris wins - because that is Solaris's domain. Solaris has long been built for scalability on large servers, and if Solaris did not win on large servers, that would be an indication something was wrong.


Re: Switching from big iron to x86 virtualisation

@Anonymous Coward

"...The Oracle Solaris engineers seem to think the Sparc Enterprise servers are NUMA architectures.... Should I trust the Solaris engineers or Kebabbert? Tough call..."

Please read my posts again so I don't have to repeat myself all the time. But what the heck, let me recap once more.

The M9000 with 32 sockets is not a true SMP server, and neither is the new Oracle M6 server with 96 sockets. But they act like true SMP servers because of good design. It takes very few hops to reach memory cells far away: the worst-case latency on the M9000 is 500ns and the best case is 100ns. The spread is tight, so in effect you can treat the M9000 as a true SMP server. You don't need to reprogram your software to make sure data is close to the current cpu, etc. No need for any of this.

In contrast, the SGI Altix is a cluster. It has a worst-case latency of 10,000ns or 70,000ns, I can't remember which. This means you can only run certain workloads on an HPC cluster. When you program for a cluster, you must make sure data is allocated close to the current cpu, otherwise performance will be very, very bad; you must reprogram your software if you intend to run on an HPC cluster. For instance, SMP workloads such as databases do not run on HPC clusters. No one runs large Oracle databases on an SGI cluster. For a reason.

However, the Oracle M6 server, with 96 sockets, ~10,000 threads and 96TB RAM, is specifically designed to run databases and other SMP Enterprise workloads. So you don't have to redesign your software for the M6 server; just copy your binary over and it runs without problems. In effect, the M6 is an SMP server.

Do you understand now why the Oracle M9000 (which is not a true SMP server) behaves like an SMP server in real life? You don't have to redesign your software. But for an HPC cluster, you must.


Re: Switching from big iron to x86 virtualisation


The SGI Altix UV1000 cluster surely beats any SMP server on HPC number-crunching workloads; the Altix was made for them. I would be very surprised if an Enterprise SMP server such as the Oracle M9000 beat a pure number-crunching cluster at number crunching.

On the other hand, if you benchmarked the SGI cluster on SMP workloads, you would see very bad performance. The M9000 is made for SMP, and the SGI is made for HPC. I don't know how many times I have to say this: they run different workloads; one does number crunching, the other does SMP. An SMP server can run trivially parallelisable problems such as number crunching, but an HPC cluster cannot run SMP workloads with latencies of 10,000ns to 70,000ns - that would be impossibly slow. That is the reason no one on Wall Street uses HPC servers for databases and other Enterprise workloads.


Re: Switching from big iron to x86 virtualisation


I was trying to explain how the Linux fanboys are wrong when they talk about Linux's supreme scalability. I do not mean that you are like them. Sorry if I was unclear.

But I hope you see my point. There has never been a Linux 32-socket server for sale, so the Linux kernel developers could not have tuned Linux for 32-socket scaling. You need to test your code on live hardware, and such hardware does not exist. Sure, some people have compiled Linux onto Unix servers and Mainframes, but just because it runs badly and scales badly does not make them good options. There is no way Linux can scale well on 32 sockets, because there are no such benchmarks.

Well, actually, HP did benchmark Linux once on 64 sockets (the Big Tux server), and Linux did badly, with ~40% cpu utilization. Of those 64 cpus, around 26 were used and the remaining 38 were idling. More than half the cpus were just sitting there, twiddling their thumbs - under full load. That is very bad. You want to use all the cpus and all the resources in a server. It would take a deluded Linux fanboy like "Roo" to think that is a good result. 38 cpus idling on a 64-cpu server under full load - is that good scaling?


Re: Switching from big iron to x86 virtualisation


"...CPU utilisation is a pretty meaningless statistic. What would you rather have 10 transactions per second with 100% CPU utilisation or 1000tps with 87% utilisation ? You can achieve 100% utilisation with a busy loop ffs..."

Please read my post again. I am saying that Solaris, using slower hardware than Linux, scored higher on SAP benchmarks. Why is that? At the same time, Solaris had higher cpu utilization and Linux lower. Coincidence? Maybe. The point is, in 8-socket SAP benchmarks, Solaris scored higher than Linux on equivalent hardware: both used the same Opteron model, but Linux's was clocked at 2.8GHz and Solaris's at 2.6GHz.

Cpu utilization is very meaningful if you run the same benchmark. If SAP utilizes resources better on one OS than on another, is that "meaningless"? Most people don't agree with you.


"...Repeating a falsehood when the truth has already been pointed out and the facts are easily available to everyone reading the thread is a silly thing to do...."

OK, show us all the truth then. It is only a matter of linking to a Linux SMP-alike server with 32 sockets. No, an IBM Mainframe with 24 sockets running Linux does not count, and neither does an IBM AIX Unix server running Linux. So please show us a Linux server instead of calling me a liar. Can you, or can you not? If not, then don't call me a liar, because that makes you the liar - you lie about me.

The only 32-socket server capable of running Linux today is a Unix server. The IBM Unix P795 is only a couple of years old (3 years or so); before the P795, where was any Linux server with 32 sockets? None existed. And still, prior to the P795, the Linux fanboys claimed that Linux scaled extremely well. On what hardware? That is just laughable. Where is the proof? Where are the benchmarks? I can never understand how Linux fanboys can go raving about Linux's excellent scalability when it has never been tested. Well, it has actually been tested once, on the HP Unix server with 64 cpus, and Linux scaled horribly, according to HP's own benchmarks.

"The truth has been pointed out" - what truth? Seriously? In what reality do you live? WHERE ARE THE 32-SOCKET LINUX SERVERS?


Re: Switching from big iron to x86 virtualisation


"...I apologize to the readers for not having given the proper icon first way round. After all, the orig comment I replied to was about 32 CPUs, not 32 CPU sockets ... in any case, I agree that "more" of some sort will give you bragging rights amongst a certain audience..."

So, what is the difference between a 32-cpu and a 32-socket server? In each socket there is a... cpu, right? So shouldn't a 32-cpu server be the same as a 32-socket server? I agree the core count might differ, but a cpu with 16 cores and a cpu with 4 cores are each still a single cpu, and each sits in a single socket. Or do you have another explanation?

Regarding Linux and 32-socket servers: there are lots of Linux fanboys who claim Linux scales so well, that it is the best. Well, I ask, how do you know that? Can you show me Linux benchmarks on a 32-socket server that prove Linux scales well? And no, they can't, because there has never been a 32-socket Linux server for sale. So how do they know Linux scales well on large 32-socket servers? Pure imagination, I guess. And when I ask for benchmarks on a 32-socket server, they call me names: "dumb", "idiot", etc. At that point I know I have won, because if they really had proof they would have shown us the links. But they have nothing, so they resort to harsh language instead. Just like Linus Torvalds himself.

So the fact is: Linux scales very badly on larger SMP-alike servers. It has trouble scaling even on 8-socket servers; just study some benchmarks. For instance, in SAP benchmarks Linux used higher-clocked cpus and faster RAM sticks, and still Solaris got a higher score, because Solaris had better cpu utilization at 99%, whereas Linux had 87%. Linux scales poorly on SMP servers because no kernel developer can make Linux scale well on large SMP servers when no such server exists to test Linux on. On clusters, however, Linux scales well, and no one denies that.

Linux scaling on SMP servers: problems scaling above 8 cpus, because there is no larger Linux SMP server than 8 sockets, so no Linux kernel developer can tailor the kernel for 16, 24 or 32 sockets - or 96 sockets, as the Oracle M6 server has. Solaris, AIX and HP-UX developers have had access to large 32-socket servers for decades of testing, so those Unixes run fine on such servers.

Linux scaling on HPC clusters: very good.


Re: Switching from big iron to x86 virtualisation @ Kebebbert

"....[You can copy your Solaris binary to a SMP server and expect it to perform well without rewriting it] That is only a given if your binary is solving a trivially parallelized problem, and in those cases a cluster will do just as well...."

Not at all. I don't know how much you know about computer science or programming, but these Oracle SMP-alike servers are not only running trivially parallelizable problems. If you believe that, I suggest you study the customers (enterprise companies) and what they use the Oracle servers for: typical SMP workloads such as big databases.

Then compare the customers of HPC clusters such as the SGI Altix and their workloads. You will find the customers are researchers, weather forecasters, oil companies, etc, and they all do number crunching. No one runs big databases or other SMP workloads on them.

I don't know how many times I need to say this: just compare the workloads. See what the SGI Altix is used for, and see what the Oracle, IBM and HP Unix servers are used for. You will find there is a big difference. Do you understand now, at last? These servers are used for different tasks: one for SMP work, the other for HPC number crunching.


"....Ah, so a 32 socket server that is running Linux doesn't count because it can run Windows, HP-UX or AIX too ? You must be a fully paid up member of the Flat Earth Society...."

I am trying to say that these big Unix servers made by IBM, HP and Oracle are Unix servers. Benchmarks from HP show that Linux does not run well on 64-socket SMP servers. My point is that no Linux vendor designs 32-socket Linux SMP servers. No one. If you need 32 sockets, you have to buy a large, expensive Unix server, compile and install Linux on it, and pray Linux does not fall apart on the server. I would hardly classify that configuration as a "Linux server". These Unix servers were built for Unix, and Unix scales well on them. Linux is hardly supported on them; on the HP-UX server, Linux is only supported up to 16 cpus - do you really call that a "Linux server"?

Just because my car can drive on a race track, I don't believe it is a Formula 1 car, do I? Exactly what is it that you have difficulty understanding? Do you not agree that all these large Oracle/IBM/HP 32-64 socket servers were built for running Unix, and that HP provably has big trouble running Linux above 16 cpus? What is so difficult to understand?

Do you still believe Linux scales well on 32-socket servers? If so, on which server have you seen those benchmarks? Not on HP's servers, because they have documented bad scaling. Maybe you have seen Linux benchmarks on the IBM P795? Can you show us those benchmarks? If not, on which server have you seen good Linux benchmarks? Not HP. Not IBM. Not Oracle. Hmmm... there are no other 32-socket vendors. I wonder how this "Roo" guy can claim Linux scales well on 32 sockets - because it doesn't. Maybe he is just making things up and has a vivid imagination?


Re: Switching from big iron to x86 virtualisation @ Kebebbert


Of course you should consider caches and align vectors properly when programming SMP servers. Of course there will be differences between best- and worst-case latency on an SMP server, because of cache misses, pipeline stalls, etc. Who on earth would believe otherwise?

My point is that when you program for a NUMA cluster with a worst-case latency of 10,000ns or more, you need to carefully redesign your software. You cannot copy your binaries to a NUMA cluster and expect them to perform well; they won't.

But for an SMP-alike server such as the Oracle/Sun servers, you can copy your binary over and expect it to perform well. That is why Oracle is building the M6 server with 96 sockets, ~10,000 threads and 96TB RAM. Oracle intends to run databases on it; as Larry Ellison said, he does not believe in the SAP HANA RAM database, because an Oracle database will run as well (if not better, because of more RAM) than HANA. You will never see anyone run a database on a NUMA cluster; that would drag performance to a halt.


"...You have already been given plenty of examples [of 32 socket Linux servers], you chose to assert they are not Linux servers which is pretty dumb seeing as they are servers that run Linux...."

I have been given two examples of Linux NUMA servers, ever. And I myself have given an example of a third NUMA cluster, the ScaleMP server. As we all know, NUMA servers are clusters. They don't run SMP workloads, such as huge database configurations; they only do number crunching.

I have also been given ONE single example of a 32-socket server: the IBM AIX Unix P795. I myself have given an example of a 64-socket server, the HP-UX Itanium Superdome. But as we all know, these Unix servers are... Unix servers. Linux scales awfully badly on the HP-UX server; it is hardly supported (only up to 16 cpus). On the IBM P795, I expect Linux to scale just as badly. I would be surprised if any customer in the world ran Linux on such an expensive server. For the price of one single POWER7 cpu, you could buy a cheap 4-socket x86 server. Nobody runs Linux on an expensive IBM Unix AIX P795, I am convinced. Its predecessor, the old 32-socket IBM P595 used for the old TPC-C record, cost $35 million list price. Who would run Linux on a $35 million server? Why not buy a bunch of cheap x86 servers instead?

So again: I invite anyone to show links to a 32-socket Linux server that is not an existing Unix server. We are talking about enterprise, and enterprise runs big databases, not number crunching. Is there any 32-socket Linux server out there? No? There never has been. No matter how harsh your language, it won't change that fact. Put up or shut up: show us the links, the proof.


Re: Switching from big iron to x86 virtualisation


"...If your definition of x86 servers can stretch (a lot) the Cray XC-30 might be of interest. It ships with the Cray Linux Environment, a cabinet can hold 384 sockets (3072 Xeon E5 cores), infinband I/O. You can add a lot of cabinets too. At the lower end you have the SeaMicro boxes ranging from 64 to 256 sockets (and I have seen them referred to as 'servers')...."

The Cray XC-30 is a cluster, an HPC cluster used for number crunching. Have you seen the workloads the Cray tackles? All embarrassingly parallel workloads, running on lots of nodes on a cluster. Cray does not make SMP servers (a single big fat server running, for instance, databases). Cray makes computing clusters, not Enterprise servers running Enterprise workloads. You will never see such HPC clusters running big fat Oracle databases, for instance.



Anonymous Coward,

"....256 socket NUMA single system image coherent global shared memory. It is not a distributed memory HPC cluster.... The IBM P795 has 32 sockets and SuSE Enterprise Linux is one of the supported systems...."

First, ccNUMA servers are clusters - not SMP servers, nor close to SMP servers. They cannot handle SMP workloads; they are clusters.


Second, the IBM P795 is not a Linux server. It is an AIX server that someone has compiled Linux for. I doubt anyone runs Linux on it, because the P795 is so expensive; it is better to run Linux on cheap x86 servers. Besides, Linux would never be able to scale to 32 sockets. HP had its "Big Tux" Linux server, the 64-socket Itanium Integrity (or was it Superdome?) server that it compiled Linux for, and Linux had something like ~40% cpu utilization across 64 sockets. Linux scaled so badly on 64 sockets that when HP sold Big Tux, HP only allowed Linux to run in a partitioned server; the biggest supported Linux partition on Big Tux was 16 cpus. If Linux scaled well, HP would have supported 64-socket partitions too. But they didn't, and even the 16-cpu Linux partitions did not work that well.

If you look at modern benchmarks of Linux on an 8-socket x86 server, the cpu utilization is quite bad - for instance, SAP benchmarks show 87% cpu utilization on 8 sockets. 16 sockets would give... 60% cpu utilization, I guess. And 64 sockets gives ~40% in confirmed benchmarks. I am convinced the IBM P795 Linux offering is very limited in terms of scalability: either it gets something like 40% cpu utilization, or the P795 only allows Linux partitions of 8-16 cpus.

So, no, there are no Linux SMP-alike servers. Some people have compiled Linux for big Unix servers, but that does not make them Linux servers. For instance, you can run Linux on an IBM Mainframe with 24 sockets, but that does not make the IBM Mainframe a Linux server.


If you look at the RAM latency of a true SMP server, it is uniform from every cpu: no matter which cpu you use, it accesses RAM as fast as every other cpu does.

SMP-alike servers, for instance the Oracle M9000 SPARC server with 64 sockets, have a worst-case latency of 500ns, which is quite bad; but the best case is something like 100ns. So the spread is tight - no big difference between worst and best case. The Oracle M9000 is not a true SMP server, because there is some difference in latency, but that does not matter, as explained below.

If you look at the latency of a NUMA cluster like the SGI Altix 256-socket Linux server, the worst-case latency is something like 10,000ns - or was it 70,000ns? I can't remember, but I know it was above 10,000ns, which is catastrophic. This changes everything. If you develop for the SGI server, you must allocate data close to the current node and design your software differently than for an SMP server, otherwise performance will be extremely bad. In effect, you design your software exactly as if it were a cluster. And if you look at the workloads customers buy SGI for, it is HPC workloads and other clustered workloads. If you do SETI-style number crunching, each node does not have to talk to other nodes; that is a typical parallel workload that NUMA clusters handle fine.
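To make the latency gap concrete, here is a hedged back-of-envelope sketch in Python. The latencies are the figures quoted above; the number of accesses per transaction is a made-up illustration, since a pointer-chasing workload like a database index walk pays each memory latency in full before the next access can start:

```python
# Rough model: a transaction is a chain of dependent memory accesses,
# so each access pays the full worst-case latency sequentially.
SMP_WORST_NS = 500         # M9000 worst-case latency quoted above
CLUSTER_WORST_NS = 10_000  # lower bound of the SGI figure quoted above

ACCESSES_PER_TXN = 1_000   # hypothetical dependent accesses per transaction

smp_txn_us = ACCESSES_PER_TXN * SMP_WORST_NS / 1_000
cluster_txn_us = ACCESSES_PER_TXN * CLUSTER_WORST_NS / 1_000

print(f"SMP worst case:     {smp_txn_us:.0f} us per transaction")
print(f"cluster worst case: {cluster_txn_us:.0f} us per transaction")
print(f"slowdown factor:    {cluster_txn_us / smp_txn_us:.0f}x")  # 20x
```

With the 70,000ns figure instead, the factor becomes 140x, which is why latency-bound SMP workloads collapse on such interconnects while embarrassingly parallel jobs (no cross-node chasing) do not care.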

If you study the new Oracle M6 server, with 96 sockets and ~10,000 threads, it is a mix of 8-socket SMP building blocks connected with NUMA links. But the worst case is again only 2 hops or so, just like in the Oracle M9000: if you need to access data, it will never take more than about 2 hops to reach it, which is fast. In effect, when developing for the M6, you design your program as if it were a true SMP server; you don't need to allocate data to close nodes, just develop your software as normal. So the Oracle M9000 and M6 run SMP workloads just fine, and you don't have to redesign your software - just take your current Solaris binary, copy it to the M6, and it will run fine. Try that on the Linux SGI cluster and it will show extremely bad performance unless you redesign your program. The M6 will run SMP workloads such as the Oracle database in very large configurations; databases are all Oracle cares about. So, in effect, the Oracle M6 and M9000 behave as if they were true SMP servers, and you don't need to redesign your software to run on them. That is not the case with the Linux SGI clusters; they can never show good performance running large database configurations with a worst-case latency of 70,000ns.

So, no, there are no 32-socket Linux servers for sale. Sure, IBM has one AIX P795 server it offers Linux on, but that does not make it a Linux server. And IBM offers a Mainframe with 24 sockets that can run Linux, but that does not make the Mainframe a Linux server either. If someone can show a link to a 32-socket Linux server, I would be very surprised, because no one has ever manufactured one. And NUMA servers don't count; they are just clusters. Anyone can make a cluster - basically, slap a lot of PCs onto a fast switch and you are done.

So, I invite anyone to show a Linux server with 32 sockets. There has never been one to this day. Why? Because Linux scales very badly, as HP's benchmarks on their 64-socket HP-UX server show. Linux cannot go beyond 8-socket servers today, and it does not even handle 8-socket servers well; just look at such benchmarks and read about "cpu utilization".


Re: Switching from big iron to x86 virtualisation

"....Refresh your tech knowledge. There's quite a few options in the x86 space these days that offer 16 or 32 CPU cores. Even if you count chips / sockets, you have a range of choice in the 8-socket space (remind me there, how many CPU sockets did a T5-8 have, again ... ?).

Yes, there's not that many x86 servers out there that can have 32 CPU sockets. If that's what counts for you, go IBM / Fujitsu / Oracle...."

I am glad you agree with me: there are no 16- or 32-socket Linux SMP servers for sale, and there never have been. The T5-8 has 8 sockets. The Oracle M6 has 96 sockets, the Fujitsu M10-4S has 64 sockets, the IBM P795 has 32 sockets, and HP has an Itanium Superdome/Integrity server with 64 sockets. There has never been a 32-socket Linux SMP server for sale. If someone objects, I invite him to post links to one. Good luck - no such server has ever existed. Sure, there is the SGI Altix 2048-core server, but it is an HPC cluster. The ScaleMP 2048-core Linux server runs a single-image kernel on a cluster; its latency is so bad it is only fit for HPC workloads such as number crunching, just like the SGI Altix.

The >32-socket server market is very lucrative and high margin. The x86 business is low margin, and Larry Ellison has declared that he does not care if Oracle's x86 business dies, because the margins are so low. IBM and Oracle do high-margin business; that is where the big bucks are. For instance, the old 32-socket IBM P595 used for the old TPC-C world record cost $35 million list price. $35 million - I kid you not. That is some serious money. The largest IBM Mainframe has 24 sockets, and it too is very lucrative, one of IBM's big cash cows.

If you Linux supporters say I am wrong, please show us a link to a 32-socket Linux server. No one has ever shown me such a link. Never. They claim I am wrong, but... no links. Lots of talk, no proof. If I am wrong, it is easy to make me shut up: show us a link to a single 32-socket Linux server. :)


Re: Switching from big iron to x86 virtualisation

Linux is fine for low-end server work, or for large clusters doing parallel HPC computations such as the SGI Altix. x86 servers are also getting more and more powerful, so Linux can take over low-end workloads. But there are no high-end SMP Linux servers for sale and never have been, so for the high end you must go to Unix or Mainframes; you have no choice. Enterprise workloads require large SMP servers with as many as 16 or 32 cpus, and no one has ever sold such large Linux servers. For cluster workloads, HPC Linux servers are fine - but no Enterprise runs on HPC servers, only on SMP servers.

Exploring our way to the source of EMC's mighty VNX Nile


Oracle new ZFS servers

The new Oracle ZS3 servers set new world records again, beating NetApp, EMC, IBM, etc.


When it doesn't get boring in series 8: Adaptec touts RAID RoCket fuel


A major problem with hardware raid: it is unsafe. Your data might get corrupted, and it is vulnerable to the write-hole error, which can make you lose all your data. ZFS fixes all these problems.


Re: Hardware raid obsolete

Of course I am serious. There is no reason to use a hardware raid card over software raid such as ZFS. Hardware raid has only disadvantages: it is slow (a server has tons more resources), it costs a lot (ZFS is free), there is vendor lock-in (you cannot migrate your disks to another card; you must buy an identical card from the same vendor), it is a SPOF (if the card breaks you are toast; with ZFS you can use several cards so you don't have a SPOF), etc, etc, and a dozen more disadvantages. Actually, I cannot come up with a single reason to use hardware raid over software raid such as ZFS.
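As a sketch of how the no-SPOF setup looks in practice, here is a hedged example of a ZFS pool mirrored across two controllers (the Solaris-style device names are hypothetical; substitute the names on your own system):

```shell
# Each mirror pairs a disk on controller 1 (c1...) with a disk on
# controller 2 (c2...); losing either HBA leaves every mirror with
# one working side, so the pool stays online.
zpool create tank \
    mirror c1t0d0 c2t0d0 \
    mirror c1t1d0 c2t1d0

# Verify the layout and the per-device checksum/error counters
zpool status tank
```

ZFS's end-to-end checksums and copy-on-write transaction model are also what close the write-hole problem mentioned above: a partial stripe write can never be committed as valid.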


Hardware raid obsolete

A hardware raid card is essentially a tiny PC on a card, with its own cpu, RAM, BIOS, raid software, etc - that is why they are expensive. You are buying a tiny PC.

Long ago a server's cpu was weak and you needed to offload I/O to raid cards. Today a multi-core server cpu has plenty of power and you don't need to offload I/O any more. For instance, running ZFS costs something like 2-3% of a single core's cpu power. That is nothing. A typical raid card, by contrast, has an 800MHz 32-bit PowerPC cpu, and yes, they are very weak.

Also, a server might have 16-32GB of RAM, whereas a raid card might have 256-512MB. A server is vastly superior in every respect, so why not run your raid software on the server instead? Use software raid such as ZFS and you get superior protection and performance. No 800MHz raid cpu can outclass a multi-core 2.7GHz server cpu. So who buys raid cards today? They are obsolete. Use software raid instead: sell your raid card, switch to open source software raid, save money and gain performance.

Torvalds shoots down call to yank 'backdoored' Intel RdRand in Linux crypto


Re: Linus is totally wrong

Duke Arco of Bummelshausen

"...That will be sufficient for all your needs, believe me on this. Most people will be even OK with an RC4 stream...."

The same RC4 that NSA might have broken?



Re: I think Torvalds is losing it


"...I've seen Torvalds present (on GIT and the failings of the CVS / Subversion model). He began the presentation by saying "You can disagree with me if you want, but if you do then you're stupid and you're ugly". And you know what? It got a good laugh from the crowd...."

You know, they would laugh at anything he said. They are his worshippers, and he is their god. He is flawless in their eyes. Even if Torvalds insulted and humiliated them, they would gladly accept being peed upon. They are brainwashed - a sect.

No sane person would accept Torvalds' behavior, as we can see in this thread.


Re: Who can tell?

Mixing random generators is never a good idea; it weakens everything if not done correctly. If you studied the subject you would know this. But a mere Linux developer would, of course, believe he knows everything.


Re: Linus is totally wrong

Blah, blah. I know the difference. I did some work on group theory and pseudo-random generators. It turned out the work was already known, but I did not know that when I started. Do you want to read my thesis on the subject?


Re: Simple h/w device?

That is a good idea, actually. It should have a market, indeed. For instance, a small radioactive source, a microphone, or something similar. Another idea would be to record noise from a microphone and extract randomness from it.

A friend at uni had to generate random numbers for a piece of software, so he took a photo with the usb camera his software already had access to, and hashed the photo to extract random numbers.
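That hashing trick can be sketched like this in Python (a toy illustration: `frame` is placeholder bytes standing in for a raw webcam image, and the trick is only sound if the input genuinely contains enough entropy):

```python
import hashlib

def extract_seed(noise: bytes, n: int = 32) -> bytes:
    """Condense raw, biased sensor noise into n seed bytes (n <= 32 here)."""
    # SHA-256 acts as a crude randomness extractor: the output only
    # looks uniform if the input carried at least ~n*8 bits of entropy.
    return hashlib.sha256(noise).digest()[:n]

# placeholder bytes standing in for a webcam frame
frame = bytes(range(256)) * 100
seed = extract_seed(frame)
print(len(seed))  # 32
```

Note the hedge in the comment: hashing cannot create entropy, it can only concentrate what the sensor noise already contained, which is exactly why mixing low-entropy sources (disk space, the millisecond clock) fails.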


Re: Linus is totally wrong

Werner McGoole,

Yes, I know all that. I studied cryptography under one of the leading experts in the world. He is world famous, and if you have studied cryptography you have surely heard of him.


Re: Linus is totally wrong

"...I know of the story you're referring to, and you're mis-stating it. First, the "mixed sources" random number generator used linear congruential generators -- no PC noise, ..."

No, you don't. I studied cryptography back then, and I remember that some company - was it Netscape? - used the space left on the hard disk as one of the inputs for creating random numbers. They used "PC noise", that is for sure. It seems you have not read the same story I did.


Re: "Random"

@Anonymous Coward,

What do you mean, "there is no such thing as random"? Have you read Professor Chaitin's work on algorithmic information theory, where the very concept of randomness is defined? Read it and then come back.

(I apologize, but it is funny how much you sound like a Linux kernel developer who thinks he knows everything when in fact he has not studied the subject and knows nothing about it. It is the hubris all Linux kernel developers display. I am not accusing you of this; I am just saying it sounds a bit funny.)


Re: Torvalds needs a paranoia transplant

Netscape mixed different random sources and introduced a pattern, so it was breakable. Donald Knuth says never to mix stuff; instead, rely on a proven, mathematically strong design. Just because you cannot break your own crypto does not mean it is safe. Read my post further down.


As I explain further down, Netscape(?) mixed different random sources (the current millisecond, space left on the hard disk, etc) with a random number generator - and researchers broke it.

As Donald Knuth explains in The Art of Computer Programming, mixing random sources is never a good idea. His own home-brewed random generator, which mixed lots of stuff, had a remarkably short period before repeating itself. Read his book on random number generators. It is obvious the Linux kernel developers have not, nor have they studied cryptography. Donald Knuth says it is better to rely on a proven, mathematically strong system than to make your own. Read my post further down.


Linus is totally wrong

There was a famous example of this. Netscape(?) mixed different random sources: they took a random number generator and added the current millisecond, how much space was left on the hard drive, and so on, to create a "truly" random number. But researchers succeeded in breaking it. Because they knew the building blocks, they could infer things such as "a typical hard drive is this big" and discard large parts of the search space, so they could decipher everything. It was a lot of work, but it was doable. Mixing different sources does not give better randomness. The Linux kernel developers would have known this if they had studied cryptography (which I have).
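The attack boils down to this: if the seed comes from a small, guessable space, the size of the output key is irrelevant. Here is a toy sketch in Python (hypothetical, not Netscape's actual code; `make_session_key` and the seed values are made up) showing how a millisecond-based seed is brute-forced:

```python
import random

# Hypothetical sketch: a server derives a "128-bit" session key
# from a PRNG seeded with the current millisecond of the day.
def make_session_key(ms_seed):
    rng = random.Random(ms_seed)     # seed drawn from a tiny space
    return rng.getrandbits(128)      # looks like a strong 128-bit key

# The "secret" key, generated at some millisecond unknown to the attacker.
secret = make_session_key(48_517_223)

# The attacker never touches the 2**128 key space: there are only
# 86,400,000 milliseconds in a day, and knowing roughly when the
# key was made (here, a 100,000 ms window) shrinks it further.
recovered = None
for guess in range(48_500_000, 48_600_000):
    if make_session_key(guess) == secret:
        recovered = guess            # the seed, and hence the key
        break

print(recovered)
```

The search is over at most 100,000 candidates instead of 2**128, which is exactly the kind of search-space collapse the researchers exploited.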

Donald Knuth tells a very interesting story about this in his magnum opus, The Art of Computer Programming. Many years ago he set out to create a random number generator, so he mixed lots of different random sources, the best he could. And Donald Knuth is a smart mathematician, as we all know. When he analyzed his attempt, he discovered it had a very short period: it quickly repeated itself. That taught Donald Knuth never to try to make a random generator (or cryptosystem) himself. Just because you cannot break your own crypto or random number generator does not mean it is safe. Knuth's conclusion in his book: it is much better to use a single well-researched random generator or cryptosystem than to make one yourself. Much better. If you start mixing different sources, you may introduce a bias that is breakable. It suffices that the adversary can discard some numbers in the huge search space to be able to break it.
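To see how quickly an ad-hoc generator can degenerate, here is a small Python sketch using von Neumann's classic middle-square method (a different homebrew generator than Knuth's own "super-random" one, but one he analyzes in the same book; the seed 6239 is arbitrary):

```python
def middle_square(x):
    # von Neumann's middle-square method on 4-digit states:
    # square the state and keep the middle four digits.
    return (x * x // 100) % 10000

def cycle_length(seed, step, max_iters=100_000):
    # Iterate until a state repeats; the distance between the two
    # visits of that state is the period of the cycle it fell into.
    seen = {}
    x = seed
    for i in range(max_iters):
        if x in seen:
            return i - seen[x]
        seen[x] = i
        x = step(x)
    return None

# With only 10,000 possible states the period can never exceed
# 10,000, and in practice the sequence collapses into a small cycle.
# Compare a well-studied generator like the Mersenne Twister, whose
# period is 2**19937 - 1.
print(cycle_length(6239, middle_square))
```

This is Knuth's point in miniature: the generator looks chaotic, but cycle detection exposes the short period immediately.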

So the NSA and the like would be more concerned if Linus used a proven, high-quality random generator. As Snowden said: the NSA breaks cryptos by cheating. The NSA has not broken the mathematics. The math is safe, so use a mathematically proven, strong random generator instead of making your own. Making your own is very bad, if you have studied basic cryptography.

The Linux kernel developers seem to have very high opinions of themselves without knowing the subject. They would probably also claim that their own home-brewed cryptosystem is safe, just because it is complex and they themselves cannot break it. That, too, would be catastrophic. They should actually study the subject instead of displaying hubris. But with such a leader....

Don't bother competing with ViPR, NetApp - it's not actually that relevant


ZFS more common than NetApp and EMC Isilon combined:


"...We [Nexenta] alone have half as much storage, we figure, under management as NetApp claims. Add Oracle and you’re already bigger than any one-storage file system. Add all Solaris and illumos deployments on top of that and you are 3-5x larger than NetApp’s OnTap. In fact, the number of ZFS users is larger than those using NetApp’s OnTap file system and EMC’s Isilon file system combined...."



My links are messed up; check them to see NetApp being 10x more expensive, for slower performance.



NetApp makes some great storage servers, at a high price. NetApp servers run FreeBSD.

There are ZFS-based storage servers running OpenSolaris that are much cheaper than NetApp. Sure, ZFS is not clustered, but if you don't need clustered storage, ZFS will provide ample resources at a very low price. Check these ZFS benchmarks against NetApp servers: ZFS is 32% faster, and NetApp is 10x more expensive:


And also, ZFS beats EMC / Isilon, NetApp, etc:



There are other ZFS vendors as well: Tegile, GreenBytes, Nexenta, etc.

You won't find this in your phone: A 4GHz 12-core Power8 for badass boxes


Re: Catching up on SPARC T5

"...If single-thread performance is the most important thing for a piece of work, a core or set of cores will step down the threading automagically and run it with fewer processor threads...."

This sounds exactly like the "critical threads" feature introduced in the old SPARC T4, where an important thread can take over a whole core, so the core runs only that single thread. That is why the T4 has strong single threads, and also extreme throughput when needed. You choose at run time.

And POWER8 CPUs are finally getting transactional memory, first introduced to the world in the SPARC T5 CPU. Sun's old ROCK CPU also had transactional memory years ago, but it never made it to delivery, so the ROCK research shows up in the SPARC Txx CPUs instead. Intel Haswell also has transactional memory now. So SPARC T5 got transactional memory, later Intel Haswell got it too, and now IBM is trying to catch up with transactional memory as well. Better late than never, though.

As I wrote:

"....Funny how POWER is getting more similar to SPARC for every iteration...."

It would be cheaper for IBM to just license SPARC CPUs, so IBM did not have to play catch-up all the time.


Re: Catching up on SPARC T5


There are lots of inconsistencies and plainly wrong "facts" in your post. You should do some reading to catch up. Or are you up to date and deliberately writing false things?


"...Fast cores -> not sparc...." We have shown official benchmarks: the SPARC T5 has 25% faster cores than POWER7, and the SPARC T5 has twice the number of cores. So yes, the SPARC T5 has faster cores and is also the faster CPU overall: up to 2.4x faster than POWER7 on TPC-C benchmarks.


"...Wow, after Oracle letting most of the Sun tech die a silent death..." What are you talking about? Oracle is capitalizing heavily on both the SPARC CMT CPUs (T5) and the SPARC M6 CPUs. And Oracle bets heavily on Solaris, too. And Java, etc.


"....OK, you are never out in the field so I will tell you how this works. If you are at Solaris 10 and want to go to Solaris 11 you cannnot upgrade. So one has to find or buy a new server, install your new OS and application there and then you have to replicate what is years of configuration from the old production server. In the end you migrate the data and do a switch. This takes weeks, is risky and has extreme cost. You can call Solaris "the most advanced operating system in the universe"(true, Sun has stated this!) all you want but it is still amateur night...."

You are totally off here. It is obvious that you are never out in the field. But let me tell you how it works. If you want to migrate a Solaris 10 server to a Solaris 11 server, you use Containers (you know, the technology that IBM AIX copied and named WPARs, just as AIX copied DTrace and named it ProbeVue): you zip up the entire Solaris 10 server and dump it into a container on a Solaris 11 server, and then you are done. You can do the same with Solaris 9 and Solaris 8 servers: dump them onto a Solaris 11 server via Containers.

You have some reading to do. Or you have just not understood what Solaris Containers are good for.
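For the record, the migration described above can be sketched roughly like this. This is only an outline under assumptions: the zone name, paths, and archive location are made up, and exact options vary between Solaris releases, so check the zonecfg(1M) and zoneadm(1M) man pages:

```shell
# On the old Solaris 10 box: archive the whole system image
# as a flash archive ("s10host" is a made-up name).
flarcreate -n s10host -c /net/archive/s10host.flar

# On the Solaris 11 host: create a solaris10-branded zone ...
zonecfg -z s10zone "create -t SYSsolaris10; set zonepath=/zones/s10zone"

# ... install it from the archive, preserving the old identity ...
zoneadm -z s10zone install -a /net/archive/s10host.flar -p

# ... and boot the old Solaris 10 server as a container.
zoneadm -z s10zone boot
```

The whole old server then runs as a zone on the Solaris 11 machine, which is the "zip and dump" workflow in practice.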

Oracle revs up Sparc M6 chip for seriously big iron


Re: Bandwidth != LAtency

"...I wonder if the compiler / kernel will be able to attempt to 'intelligently' allocate or shift threads to cores where the other threads that need to 'talk' to the first one are a small number of hops away...."

I suspect it is not really necessary. On a NUMA cluster you must design your program that way, because worst-case latency might be 10,000 ns or more. But on this M6 server the worst case is only 2-3 hops away, which gives good latency. So you just program this server as a true SMP server: normal programming, just copy your binaries over and off you go. You don't need to recompile and redesign to make sure your data sits in nearby nodes. Just treat it like a true SMP server, thanks to the good design.

But if, on the other hand, you want to port your Linux applications to a NUMA cluster such as the SGI Altix servers, you must redesign and rewrite the programs. Otherwise performance will grind to a halt if you do not make sure the data is located in nearby nodes. In the worst case it is almost like accessing a hard disk, because the memory you need is so many nodes away. So these Linux NUMA clusters are aimed at parallel HPC workloads; just check the use cases and the benchmarks. They are all HPC cluster stuff, not SMP stuff.
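On Linux, the remote-node penalty is easy to demonstrate with the standard numactl tool (assuming a machine with at least two NUMA nodes; ./membench is a placeholder for any memory-bound program you want to time):

```shell
# Show the NUMA topology: nodes, their CPUs, memory and distances.
numactl --hardware

# Run with CPUs and memory pinned to the same node (local access).
numactl --cpunodebind=0 --membind=0 ./membench

# Run with CPUs on node 0 but memory forced onto node 1,
# so every access pays the remote-hop latency.
numactl --cpunodebind=0 --membind=1 ./membench
```

Timing the two pinned runs shows exactly the locality effect discussed above; on a big multi-hop cluster the gap only gets worse.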