@Brian
Never mind FPU accelerators - how's about the whole parallel thing? Anyone else remember the Transputer? The company who sponsored me at uni (GEC Alsthom Transmission and Distribution Power Electronic Systems Limited, to give it its full title) used them, and they were pretty damn cool.
The Occam2 language in particular was a neat idea, since it gave native support for parallelism. No messing around with the detail of threads and stuff - you just said "PAR", and the paths under that statement ran in parallel. If you happened to be at the top level then those paths got spread over the separate cores, or at lower levels it timesliced, but all that was done for you. And comms between processors was equally seamless - from a software PoV it just looked the same as running over separate cores.
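For anyone who never saw it, here is a rough sketch of the PAR idea in Go, whose goroutines and channels draw on the same CSP ideas. This is not occam and not how the Transputer toolchain worked; `par` and `sumOfSquares` are names made up for illustration, and `close` on a channel is a Go convenience with no direct occam equivalent as far as I know.

```go
package main

import "sync"

// par runs every branch concurrently and returns once all of them have
// finished, which is roughly what an occam PAR block does for its
// component processes. (Hypothetical helper, not a library function.)
func par(branches ...func()) {
	var wg sync.WaitGroup
	for _, branch := range branches {
		wg.Add(1)
		go func(run func()) {
			defer wg.Done()
			run()
		}(branch)
	}
	wg.Wait()
}

// sumOfSquares wires a producer and a consumer together with a channel.
// Neither branch knows or cares where the other one is running.
func sumOfSquares(n int) int {
	// Unbuffered channel: sender and receiver rendezvous on every value.
	link := make(chan int)
	total := 0
	par(
		func() { // producer: sends 1*1, 2*2, ... n*n
			for i := 1; i <= n; i++ {
				link <- i * i
			}
			close(link)
		},
		func() { // consumer: adds up whatever arrives
			for v := range link {
				total += v
			}
		},
	)
	return total
}
```

Calling `sumOfSquares(4)` adds 1, 4, 9 and 16. The point is that the code only says what runs in parallel and what talks to what; deciding where each branch runs is left to the runtime.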
For a little while it was the fastest thing around. Trouble is that like all British technology, no-one was prepared to put money into it. So it drifted backwards until a single 486SX25 could comfortably blow away a bunch of Transputer cores, and that was that.
Now of course we're back where we started, because single cores have basically run out of speed-up potential. And of course, since Win95, programs have used multiple threads to do stuff in parallel with time-slicing. So everyone has the joy of managing threads for themselves without any decent techniques for tracking deadlocks and livelocks, when back in the late 80s and early 90s anyone using Transputers had already solved this problem.
This is why I differentiate between computer science and software engineering. Engineering is about solving problems and keeping the solutions around, so you don't have to reinvent them every time. A civil engineer doesn't have to go back to first principles for a bridge, because since 1861 the principles for building a strong bridge have been pretty well ingrained in the profession. The trouble with computer science is that in constantly chasing the bleeding edge, they seem to have absolutely no idea of the history behind where they are, or of the patterns in it. So single-core code gets hacked to explicitly support dual-core when dual-core processors come around; and then someone releases a quad-core processor and the CompSci boys need to hack their code again for "if(cores==4)". The idea that there's a pattern involved - number of cores increasing - seems to pass them by, as does the fact that back in the 80s there was a ton of work done on load-sharing across parallel processors.
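To make the "if(cores==4)" complaint concrete, here is a minimal Go sketch of the alternative: treat the core count as a number you ask for at run time and split the work accordingly. `parallelSum` and `sumOnAllCores` are hypothetical names for this example; `runtime.NumCPU` is the standard library call that reports the number of logical CPUs. Real load-sharing is a good deal more involved than chopping a slice into equal chunks, so take this as the shape of the idea only.

```go
package main

import (
	"runtime"
	"sync"
)

// parallelSum splits data into one chunk per worker and adds the chunks
// up concurrently. The worker count is a parameter, not a hard-coded 2 or 4.
func parallelSum(data []int, workers int) int {
	if workers < 1 {
		workers = 1
	}
	partial := make([]int, workers)              // one slot per worker, no sharing
	chunk := (len(data) + workers - 1) / workers // ceiling division

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		// Clamp the chunk bounds so surplus workers get an empty slice.
		lo := w * chunk
		if lo > len(data) {
			lo = len(data)
		}
		hi := lo + chunk
		if hi > len(data) {
			hi = len(data)
		}
		wg.Add(1)
		go func(w int, part []int) {
			defer wg.Done()
			for _, v := range part {
				partial[w] += v
			}
		}(w, data[lo:hi])
	}
	wg.Wait()

	total := 0
	for _, p := range partial {
		total += p
	}
	return total
}

// sumOnAllCores asks the runtime how many logical CPUs exist and uses
// that many workers, so the same code runs on 1, 2, 4 or 64 cores.
func sumOnAllCores(data []int) int {
	return parallelSum(data, runtime.NumCPU())
}
```

Nothing in there needs touching when the next processor doubles the core count, which is the pattern the hard-coded version misses.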