Stratus tolerates faults on Windows HPC

There are many ways to gang up machinery to scale applications on groups of servers or provide a measure of disaster recovery or fault tolerance for those applications. Supercomputer customers are known for spending big bucks on exotic technology, but they're also notorious cheapskates. That's why Linux and the clustering of …

COMMENTS

This topic is closed for new posts.
  1. Solomon Grundy

    Market Creation and Development

    Microsoft has always led the pack in market creation and development. Regardless of technical superiority, MS has consistently provided solutions, or provided avenues for 3rd party solution development. Their leadership built lots of things - from a PC in lots of homes, to widespread Internet browsing, to high performance computing applications, and to Open Source Software. Without MS there would not be many user-friendly (non-command line) options for using a computer. And the best part is the most useless people are going to trash me for what I said because they like command lines: unfortunately they aren't part of business...

    It only makes sense that MS will get some traction in HPC. They've gotten traction for themselves and many, many others in the past so this is just another opportunity to raise the bar in usable computing - which includes vendors, channels, developers, sales, and most important - USERS. It doesn't matter what's "best"; you must please a significant portion of your users. Failure to do so leaves you standing all alone with an unwanted product - sort of like lots of IT professionals have been left standing all alone with their willy in their hand and no one to use their product. It's marketing and partnerships that make products. Everything else is a technical issue that can be solved one way or another.

  2. John Benson
    Coat

    disappeared into Compaq, later part of HP?

    That's only accurate in the sense of corporate visibility. Tandem had already quietly appeared and then disappeared into the banking and stock trading infrastructure of the world. For example: if you gave ATM network owners a choice between having their water or their Tandem systems turned off, there would be a run on Porta Potties.

    Scaling to serve read-only webpages is a piece of cake since there is no need for session management. Scaling to serve sessions is harder, and gave rise to the whole J2EE stack and other expedients. Scaling to ensure financial transaction integrity and continuous service is another matter altogether, since you can't solve the problem by just throwing machinery at it: this is why the secret sauce has always been the Tandem OS and its transaction monitor as much as the changing hardware generations underneath them.

    Although I'm only a Tandem and not a Stratus programmer, my understanding is that Stratus at least started out with a more hardware-oriented fault-tolerance approach than Tandem. But Tandem started out as a shared-nothing system: no single hardware failure could take the system (and therefore the application) down. As such, Tandem was solidly grounded in both software and hardware fault-tolerance.

    Although the backupability and network mobility of virtual machines has tantalizing implications for near-continuous availability, I'm not aware of any VM-based system that guarantees no lost or duplicate transactions. Until one arrives, ATM and stock-market operators would rather fight than switch.

  3. J
    Alert

    @Solomon Grundy

    "Their leadership built lots of things [...] high performance computing applications, and to Open Source Software. Without MS there would not be many user friendly (non-command line) options for using a computer."

    Man, you're either Steve Ballmer in disguise or whatever you're smoking is REALLY weird stuff...

  4. Pierre

    My dear Solomon, sir,

    Your fanboyism is touching, really. Though I admit that MS has done a lot for the software world, it has mainly been through the pressure they put on competitors. They never really invented anything, they took ideas here and there and pushed that, half-baked in general, down Joe Bloggs' throat. MS is basically a huge reverse-engineering sweatshop with an eyecandy sprinkler above the exit. At their best, they managed to put two "picked" ideas together to increase user-friendliness, but that's all (you might argue that it's already a lot, and you might even be right, but the point still stands).

    As for the present case, the principle is laudable, but MS has such an impressive track record in deception that I think I'll pass. Maybe when it's _demonstrably_ more (I was gonna say resilient, but let's not push it) stable than the competition... and even then, with half a brain you can get 5 nines out of your "it went for free in a cereal box" x86 cluster, so if they want to sell their stuff (aggressive marketing aside) they will need to do much better. And then they'll be in direct competition with the really big boys. Resilient and damn powerful and shit. I'm sure the hardware will be up to it, but ... well, you know. Things MidaS touches, all that.

  5. James Anderson
    Happy

    DUH!

    Running a flaky, insecure, unreliable OS on 99.9999% reliable hardware is not going to result in a reliable system.

    The reason for the popularity of UNIX-based multisite HA clusters is that they protect you against all the "biblical" failures (fire, flood and electricity famines). Coupled with a reliable OS and well-written software you have ninety-nine point something reliability depending on how much money you have to spend (as a rule of thumb each extra 9 will double your budget!).

    The sad fact is that a modern UNIX HA cluster is probably more reliable than the company that owns it. We don't have the technology to protect the systems from their management.

    PS. @Solomon Grundy -- keep taking the dried frog pills!

  6. Anonymous Coward
    Joke

    @Solomon

    Don't be an idiot, flame rant rant, how dare you speak common sense, how dare you not slag off M$. Where are your herd-following instincts, <OS> roks, M$ SUCK and is root of all evil, blah blah blah......

    Thought I'd flame you for no other reason than it's cool and trendy to bash M$, even if you are speaking sense...

  7. Mark

    Stu stFu

    Don't PROVE you're a retard.

  8. TheDude
    Thumb Down

    Stratus doesn't live up to the marketing

    This would be great if Stratus was a clean and transparent hardware solution; but it isn't.

    The Stratus Windows 2003 install relies on a *lot* of proprietary drivers and is hooked into Windows on such a fundamental level that upgrading Windows patches can break the Stratus drivers! I have seen nodes randomly rebooting *both sides* of an ftServer after installing a regular set of patches. Stratus' software is also only rarely updated, which means you have to wait for Stratus to give you a thumbs up on any Microsoft updates you want to install.

    Not to mention that a Stratus server with 2-year-old hardware (2x dual-core Core 2 Duo and 4GB of RAM with SATA, yes, SATA disks) costs over £15000!!! How is that scalable? Why not buy 10 commodity servers instead? Not to mention that your pathetic 4-core server is taking 4U of rack space!! I can get 8 cores in 1U, why do I want this?

    All in all, I cannot support Stratus, and having worked with them for a couple of years I have seen no reason to approach them again whatsoever. They happen to be the only vendor in this particular market but that doesn't mean that they have a good product.

    All in all another case of good marketing, bad product. Stay well away....

  9. Daniel B.
    Thumb Down

    @Solomon

    Hm... I doubt we'd still be using command-line systems even if MS had never existed. We already had ISPF, that Xerox graphical environment, the X Window System and even Macintosh. (Note: It wasn't called "MacOS" until sometime around 7.5)

    Hell, some large organizations still use ISPF to this day, including my local telco and some banks. Even the web basically works as a glorified 3270 terminal!

    I somehow doubt MS would get any significant traction in the HPC business; HPC users are more of the scientific type, not the "average Joe Bloggs" user they're used to. It's bad enough that the RISC architecture and vector processors have been mostly displaced by el-cheapo x86 junk; now trying to put Windows on top of that is even uglier. I really hope that actual HPC users don't go down this road.

  10. Denny Lane

    "TheDude" ...time to look at Stratus again

    TheDude,

    As an employee of Stratus, I am sorry that you encountered problems with the ftServers you worked with in the past. Maybe it is time to look at them again.

    Our products are designed to provide users with the highest availability possible for their most critical workload and deliver this solution in an operationally simple fashion. This level of focus on availability is not for every customer nor every workload, just those that demand the most available systems in the market. As you may know we monitor the uptime (hardware & OS) of the thousands of ftServers installed around the world and it currently is 99.99990% - actual, not calculated. That's an average of 32 seconds of downtime per year with off-the-shelf Windows or Linux.
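
    For readers who want to check the arithmetic behind an availability figure like the 99.99990% quoted above, here is a minimal sketch in Python. The function name and the 365-day year are illustrative assumptions, not anything published by Stratus; it simply converts an availability percentage into the yearly downtime it implies.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 s, ignoring leap years (assumption)


def downtime_seconds_per_year(availability_percent: float) -> float:
    """Return the yearly downtime, in seconds, implied by an availability percentage."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * SECONDS_PER_YEAR


if __name__ == "__main__":
    # Print the implied downtime for three, four, five and six nines.
    for pct in (99.9, 99.99, 99.999, 99.9999):
        seconds = downtime_seconds_per_year(pct)
        print(f"{pct}% availability -> {seconds:.1f} s of downtime per year")
```

    At six nines this comes to roughly 31.5 seconds per year, which is consistent with the rounded 32-second figure; each nine removed multiplies the downtime by ten.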

    A couple points to update you on:

    * Current models ship with the latest Intel Xeon quad core processors and 15K SAS hard disks

    * Our systems are 4U (2 x 2U) which is the same as if you were to cluster two HP DL380 servers

    * Pricing will vary depending on configuration, reseller, and geography. If you were to compare our system to the above mentioned HP cluster with shared storage - we would have about a 20% price premium. Most customers feel that is a small price for the higher availability and simpler operation than a cluster.

    Any IT professional building an HPC cluster, either Windows or Linux, should consider using a fault tolerant server at the critical points like the Head and Broker nodes.

    Denny Lane

    Stratus Technologies
