* Posts by GreggSchoenberger

2 publicly visible posts • joined 29 Mar 2022

Datacenter migration plan missed one vital detail: The leaky roof

GreggSchoenberger

Facilities maintenance goes horribly wrong.

This story brought back memories, none of them good.

Our computer center was world-class, with machines worth tens of millions of dollars. Unfortunately, it sat in an older building that, like many structures in California, had a flat roof. And it was time for some maintenance work. The bids went out late in the summer, so work began close to the beginning of California's rainy season. The crew took off the old roof, put down foam insulating panels, sealed the joints, and coated the surface with the usual hot-tar mixture. Hot tar on 95+ degree Fahrenheit days is much more odious than normal, and soon the floors of the office building attached to the machine room building were filled with the sickening smell of dead dinosaur. Management actually showed a bit of concern for me and temporarily moved me to an office in another building upwind of the roofers. I just wish that had been the end of the tale.

Several days later the rains began in earnest. Steady, sometimes heavy rain. About two o'clock in the morning I got a call from my boss's boss.

"We have water coming into the machine room. I'm driving over to pick you up."

"WHHHAAATT?"

Oh, there was water coming in, all right. It was like a rain forest in there. Operations staff were using 55-gallon drums to catch the flow in an attempt to keep the place from flooding. My job was to help power down the supercomputers, smaller machines, and peripherals so that no one would get electrocuted. Given that some machines had 100-amp bus bars feeding the circuitry, it did seem prudent to take the gear offline.

It turns out that the roofers made a bevy of mistakes. They graded the roof incorrectly, fitted flanges improperly, AND left debris on the roof that washed into the drain. The drain plugged up, allowing water to build up on the new roof and then flow down into the machine room.

I won't repeat what our program director was saying about the roofers early that morning - there were a lot of references to various forms of extremely brutal summary execution. The emergency work crews found the screw-ups, which led to a lot more violent language from our program director. Eventually the roof was properly watertight and we all could sleep - albeit fitfully.

The wild world of non-C operating systems

GreggSchoenberger

Fortran-based Operating Systems

Lawrence Livermore needed an operating system with more features than the supercomputer vendors offered at the time - classified-level support, time-sharing, etc. So they rolled their own: LTSS for the CDC and Cray machines, CTSS for Cray hardware, and later a new version of LTSS - NLTSS - for Crays. All of these systems, along with their utilities and support libraries, were written in the local dialect of Fortran-77, later Fortran-90.

All of the systems used concepts developed for Multics, but were heavily modified to suit the needs of the laboratory computing community. One notable feature was that the mapping of program memory to disk (aka a core dump) was also used for swapping a program out when its time slice expired, AND the resulting image was restartable.

Eventually these systems were replaced by Unix derivatives - notably Linux.