Re: Interesting side effects of this development..
Let me toss in some ideas/facts :)
Windows NT was never x86/x64 only. It wasn't even originally developed on x86. Windows has been available for multiple architectures for the past 25 years. In fact, it supported multiple architectures long before any other single operating system did. In the old days, when BSD or System V was ported to a new architecture, it was renamed as something else, and there was generally a lot of drift between code bases due to hardware differences. The result was that UNIX programs were riddled silly with #ifdef statements.
The reason other architectures never really took off with Windows was that we couldn't afford them. DEC Alpha AXP, the closest to succeeding, cost thousands of dollars more than a PC... of course it was 10 times faster in some cases, but we simply couldn't afford it. Once Intel eventually conquered the challenge of working with RAM and system buses operating at frequencies different from the internal CPU frequency, they were able to ship DEC Alpha speed processors at x86 prices.
There was another big problem. There was no real Internet at the time. There was no remote desktop for Windows either. The result was that developers didn't have access to DEC Alpha machines to write code on. As such, we wrote code on x86 and said "I wish I had an Alpha. If I had an Alpha, I'd make my program run on it." So instead of making a much cheaper DEC Alpha which could be used to seed small companies and independent developers, DEC, in collaboration with Intel, decided to make an x86 emulator for Windows on AXP.
The emulator they made was too little too late. The performance was surprisingly good, though. They employed technology similar in design to what Apple later used in Rosetta. Dynamic recompilation is not terribly difficult if you think about it. Every program in modern times has fairly clear boundaries. Programs call functions either in the kernel via system calls, which are easy to translate... or in other libraries, which are loaded and linked via 2 to 5 functions (depending on how they are loaded). When the libraries are from Microsoft, they know clearly what the APIs are... and if there are compatibility problems between the system level ABIs, they can be easily corrected. Some libraries can be easily instrumented with an API definition interface, though C programmers will generally reject the extra work involved... and just port their code instead. And then there's the option that if an API is unknown, the system can simply recompile the library as well... and keep doing this until the boundaries between the two architectures are known.
Here's the problem. In 1996, everyone coded in C, and even if you were programming in C++, you were basically writing C in C++. It wasn't until around 1999, when Qt became popular, that C++ started being used properly. This was a problem because we were also making use of things like inline assembler. We were bypassing normal system call interfaces to hack hardware access. There were tons of problems.
Oh... let's not forget that before Windows XP, about 95% of the Windows world ran either Windows 3.1, 95, 98 or ME. As such, about 95% of all code was written on something other than Windows NT and used system interfaces which weren't compatible with Windows NT. This meant that the programmers would have to at least install Windows NT or 2000 to port their code. This would be great, but before Windows 2000, there weren't device drivers for... well anything. Most of the time, you had to buy special hardware just to run Windows NT. Then consider that Microsoft Visual Studio didn't work nearly as well in Windows 2000 as it did in Windows ME because most developers were targeting Windows ME and therefore Microsoft focused debugger development on ME instead.
So... running emulated code on Alpha did work AWESOME!!!! ...as long as the code worked on Windows NT or Windows 2000 on x86 first. Sadly, there was no real infrastructure around Windows NT for a few more years.
That brings us to the point of this rant. Microsoft has... quite publicly stated their intent to make an x86/x64 emulator for ARM. They have demoed it on stage as well. The technology is well known. The technology is well understood. I expect x86/x64 code to regularly run faster on the emulator than as native code, because most code is optimized for a generic architecture, whereas dynamic recompilers can optimize for the specific chip they are executing on and constantly improve the way the code is compiled as it's running. This is how things like JavaScript can be faster than hand coded assembly. It adapts to the running system appropriately. In fact, Microsoft should require native code on x64 to run the same way... it would be amazing.
So, the emulator should handle about 90% software compatibility. Not more. For example, I've regularly written code which makes use of special "half-documented" APIs from Microsoft listed as "use at your own risk", because I needed to run code in kernel space instead of user space to get better control over the system scheduler and achieve more real-time results. That code will never run in an emulator. Though nearly everything else will.
Then there's the major programming paradigm shift which has occurred. The number of people coding in system languages like C, C++ and Assembler has dropped considerably. On Linux, people code in languages like Python where possible. It's slow as shit, but works well enough. With advances like Python compiler technology, it's actually not even too pathetically slow anymore. On Windows, people program in .NET. You'd be pretty stupid not to in most cases. We don't really care about the portability. What's important is that the .NET libraries are frigging beautiful compared to legacy coding techniques. We don't need things like Qt and we don't have to diddle with horrible things like the standard C++ library, which was designed by blind monkeys more excited about using every feature of the language than actually writing software.
The benefit of this is that .NET code runs unchanged on other architectures such as ARM or MIPS. Code optimized on x86 will remain optimized on ARM. It also gets the benefits of JavaScript-like dynamic compiler technology, since they are basically the same thing.
Linux never really had much in the way of hardware independent applications. Linux still has a stupid silly amount of code being written in C when it's simply the wrong tool for the job. Linux has the biggest toolbox on the planet and the Linux world still treats C as if it's a hammer and every single problem looks like a nail. Application development should never ever ever be done in system level languages anymore. It's slower... really it is... C and C++ make slower code for applications than JavaScript or C#. Having to compile source code on each platform for an application is horrifying. Even considering the structure of the ABI at all is terrifying.
Linux applications have slowly gotten better since people started using Python and C# to write them. Now developers are more focused on function and quality as opposed to untangling #ifdefs and makefiles.
Now... let's talk super computing. This is not what you think it is, I'd imagine. The CPU has never really meant much on super computers. The first thing to understand is that programmers will write code in a high level language which has absolutely no redeeming traits from a computer science perspective. For example, they can use Matlab, Mathematica, Octave, Scilab, ... many other languages. The code they write will generally be formulas containing complex math designed to work on gigantic flat datasets lacking any structure at all. They could of course use simulation systems as well, which generate this kind of code in the background... it's irrelevant. The code is then distributed to tens of thousands of cores by a task scheduler. Often, the distributed code will be compiled locally for the local system, which could be any processor from any architecture. Then, using message passing, the different tasks are executed and the results are collected back to a system which will sort through them.
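For anyone who hasn't seen that pattern, here is a rough sketch of the scatter/gather shape in Python. It is an illustration under assumptions: a thread pool stands in for the cluster's task scheduler, and the names `chunked`, `worker` and `scatter_gather` are made up. Real installations typically use something like MPI across many nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(data, n_chunks):
    """Split a flat dataset into at most n_chunks contiguous slices."""
    size = max(1, -(-len(data) // n_chunks))  # ceiling division, at least 1
    return [data[i:i + size] for i in range(0, len(data), size)]


def worker(chunk):
    """The 'formula' each core runs on its slice: here, a sum of squares."""
    return sum(x * x for x in chunk)


def scatter_gather(data, n_workers=4):
    """Scatter slices to workers, then gather and combine partial results."""
    chunks = chunked(data, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(worker, chunks))
    return sum(partials)


print(scatter_gather(list(range(10))))  # sum of squares of 0..9
```

Note that nothing about the data has any structure here: it is just a flat list cut into slices, which is the kind of workload described above.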
It never really mattered what operating system or platform a super computer runs on. In fact, I think you'd find that nearly 90% of all tasks which will run on this beast of a machine would run faster on a quad-SLI PC under a desk that had code written with far less complexity. I've worked on genetic sequencing code for a prestigious university in England which was written using a genetic sequencing system.... very fancy math... very cool algorithm. It was sucking up 1.5 megawatts of power 24/7 crunching out genomes on a big fat super computer. The lab was looking for a bigger budget so they could expand to 3 megawatts for their research.
I spent about 3 days just untangling their code... removing stupid things which made no sense at all... reducing things to be done locally instead of distributed when it would take less time to calculate it than delegate it... etc...
The result was 9 million times better performance. What used to require a 1.5 megawatt computer could now run on a laptop with an nVidia GPU... and do it considerably faster. Sadly... my optimizations were not super computer friendly, so they ended up selling the computer for pennies on the dollar to another research project.
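The "do it locally when calculating takes less time than delegating" part can be put as a back-of-the-envelope cost model. This is a sketch under loud assumptions, not the actual analysis done on that code: the function names, the fixed per-message overhead, and the idea that the coordinator sends messages one after another are all hypothetical simplifications.

```python
def local_time(n_tasks, compute_s):
    """Total time if one machine just does all the work itself."""
    return n_tasks * compute_s


def distributed_time(n_tasks, compute_s, overhead_s, n_workers):
    """Total time if every task is shipped out to a worker.

    Assumes (hypothetically) that the coordinator pays overhead_s per
    message serially, while the compute itself is spread evenly.
    """
    return n_tasks * overhead_s + n_tasks * compute_s / n_workers


def should_delegate(n_tasks, compute_s, overhead_s, n_workers):
    """Delegate only when it actually beats doing the work locally."""
    return (distributed_time(n_tasks, compute_s, overhead_s, n_workers)
            < local_time(n_tasks, compute_s))


# Tiny tasks: a microsecond of math behind a 100 microsecond message.
print(should_delegate(1_000_000, 1e-6, 1e-4, 1000))
# Chunky tasks: a full second of math behind the same message cost.
print(should_delegate(1_000, 1.0, 1e-4, 8))
```

With these made-up numbers, the tiny tasks are cheaper to run locally no matter how many workers are available, because the messaging overhead alone exceeds the compute time.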
People get super excited about super computers. They are almost always misused. They almost always are utterly wasted resources. It's a case of "Well I have a super computer. It doesn't work unless I message pass... so let me write the absolutely worst code EVER!!!! and then let's completely say who gives a fuck about data structure and let's just make that baby work!!!!"
There are rare exceptions to this... but I'd bet that most supercomputer applications could have been done far better if labs bought programmer hours instead of super computer hours.