Oops
Red Bee: Doesn't really matter, the broadcasters have got a backup anyway.
Ch 4: Doesn't really matter, Red Bee have got a backup anyway.
Insert the adage about the backup hasn't been tested until you restore it here.
Confusion continues to reign in the world of television, including UK national broadcaster Channel 4, weeks after a broadcast centre cockup wrought havoc upon servers. Things went horribly wrong at Red Bee Media's broadcast centre back on 25 September. Yes, that was the weekend before we ran an accidentally appropriate episode …
All the HD masters for Babylon 5 were lost in the same way. The production company thought Warner Brothers would store them, Warner Brothers thought the production company would store them.
The production company had the foresight back in 1993 to see that HD and widescreen would be the norm in future, so they recorded the show in both, but in such a way to broadcast it in 4:3 such as having the main characters stand within a 4:3 square of a 16:9 shot. They didn't have the foresight however to make sure they had backups. We'll never get to see it as originally intended and shot. The DVD was a mix of what footage they had left, mixed in with recovered-from-video footage.
1/ Slightly glossed over is the fact the servers are fine but the hard drives in them died. It was generally known that the noise from gas fire suppression systems can damage hard drives, but not sure it was well known. I certainly didn't know how extensive the damage can be, i.e. the percentage of HDDs that would be destroyed.
A little bit more from C4 themselves.
https://www.channel4.com/press/news/whats-happened-access-services-channel-4
2/ "Red Bee Media got the pictures and audio up and running again quickly," yes, but the key thing here is that this was not from the primary site but from a DR site.
They are all still on DR, see Channel 5 with the white and black square in the right hand corner, indicating they are *still* in DR playout. The issue is Channel 4's DR subtitle playout isn't working. I heard some concern that fixing this is dangerous whilst in DR (as they have no further DR if they break something). So the changes being made to Red Bee's DR are very limited.
"It was generally known that the noise from gas fire suppression systems can damage hard drives, but not sure it was well known."
Fairly well known to El Reg readers, methinks.
My knowledge of fire suppression systems is v outdated. Does anybody know if such a system would be equally effective if it discharged the gas slightly more slowly? Or does it need to discharge so rapidly to get high turbulence = effective mixing?
the white and black square in the right hand corner, indicating they are *still* in DR playout
Ah, thanks! I noticed this and assumed it was a signal for "more bleedin' adverts coming up", but when it stayed on all the time I guessed it wasn't. Nice to know though!
> It was generally known that the noise from gas fire suppression systems can damage hard drives, but not sure it was well known.
Yupe, well known indeed, e.g.:
Zurich Insurance: http://www.zurichservices.com/zsc/reel.nsf/b777a8062cedf191c12571fe00467717/6361b933b360dbac862584790079666a/$FILE/rt_fixed_fire_protection_data_centers_and_HDD_noise_sensitivity.pdf
3M (competitors to Halon systems): https://multimedia.3m.com/mws/media/1180481O/clean-agent-system-noise-hard-disk-drive-hdd-failure-faqs.pdf
I seem to remember reading an article somewhere (maybe even on TheReg) about someone (re)designing fire suppression output baffles and also having design software to optimise their placement within DCs to eliminate the risk to HDDs. After a quick search all I can find (not the article I'm thinking of) is this:
https://solarfiresystems.com/news/silence-is-golden
Oof, this probably hasn't been a fun few weeks for the sys-admins, engineers, and various sundry dogsbodies at C4 and Red Bee.
We've all screwed up in prod before, but having your problems as publicly visible as a TV channel must be terrifying.
I'll save one of these for them when they finally get it fixed>>>>>
I had a trip up the BT tower a few years ago and met a fairly senior bloke from BT whilst there. He was very chatty and told me loads although some questions were left unanswered because of security etc. I didn't get to see the War Room in the basement but I did ask if I could. He told me that there's basically very little going out of the top of the tower these days. Useful as a place to impress people though and the view is indeed very good. There is the BT International Media Centre there and he said they carry a large amount of TV through there. He said if it had a problem you'd doubtless notice*, also said that they had redundancy, backups etc. Most interestingly to me he said if they were directed to do so by the government in a major emergency they could and would switch all of those over to a government feed. Whether that feed was coming from the "Defence Crisis Management Centre" or somewhere else he wouldn't say.
Then some people** did notice when the X Factor went off air. I wondered if they really did have backups and redundancy etc.
**Not me though.
Gawd knows how long Red Bee would take to respond & recover if given a four minute warning of nuclear attack.
There was an actual plan sort of involving commercial broadcasters for the Attack Warning Red message. Well, the commercial broadcasting transmitters anyhow. This was when the transmitters were all owned by the Independent Broadcasting Authority. I was told that all the feeds to transmitters would be switched to a feed from the BBC and the warning message would be broadcast simultaneously. Commercial broadcasters had no part in the plans once it was confirmed that the missiles were on the way.
The codeword that would have been used to confirm and authorise the "Attack Warning Red" message was "Falsetto". It would have been sent by either the Director of the UK Warning and Monitoring Organisation (UKWMO) at RAF High Wycombe bunker or their Deputy at the UKWMO Bunker near Preston.
Yes, it's a £50 voucher to spend at Currys. But since there's no over the air transmission for those affected, the best they can do is buy a Chromecast or whatever dongle and hope their broadband is up to it. Bear in mind that the people most likely to still be affected, i.e. not within sight of the temporary mast or the smaller repeaters, are likely to be in very rural locations so less likely to have decent broadband. And then there's the folk, especially older folks relying on TV and Radio, who may not have BB at all. You can't get a Freesat kit for £50, let alone pay for installation.
At great expense, the last system I designed had a proper, offsite, in a different datacentre, DR platform. Whilst it needed a little work to point to a different DB, it was a complete replica of Prod. And it was tested. Yearly.
Getting it approved was a nightmare. "Oh, can't you just rebuild it?" "Of course, me and, ahem, which army?"
But it got built when I decided to turn my phone off one weekend when we had a small outage
So, I ask again, was DR ever tested? I'm guessing that's a big fat NO
When I were a mere lad working in broadcast TV, "Pebble Mill at One" was essentially a regular and frequent DR test. Pebble Mill in Birmingham was the alternative network control for the BBC if TV Centre went down in a sufficiently spectacular manner, and PM@1 was used to check things could be fed directly to the transmitter network without involving London.
Twelve or fifteen years ago, when the BBC rebuilt most of its radio playout systems, a studio had a local server which cached the audio from the apps room servers. The apps room had three servers using two different OSes; all these ran at once to deliver audio. In most cases there were two identical apps rooms, at opposite sides of a broadcast centre, on separate power and network systems and fed from external circuits that didn't share ducts...
I recall driving up the M1/M6 to Birmingham with *all* of Radio 2's music on a server in the back of a rented van; significantly faster than transferring it electronically (though a couple of years later we re-engineered the delivery backbones with 64Gb/s circuits).
I remember 'DR rehearsals' when we would set off from TVC in a load of cars for BM to try and create BBC1 and 2 from there. It was quite fun meeting up with colleagues we had not seen for months but I don't remember it ever going that smoothly trying to use a studio setup to playout TV channels.
And then BBC DR moved to Elstree.....and now it is either Salford or W12 depending on stuff.
Testing failover, or testing failover scenarios?
I worked for C4 back in the 00's when failover from the main broadcasting control in Horseferry Road to the DR at I think Old Ford Lock was tested regularly and always worked. The failover process was for someone at HFR to switch control to OFL (I picture a huge blade switch but probably just a little, red button). As soon as that was done you had 30min or so to get someone to the desk in OFL and queue up the next set of programmes.
What wasn't tested though were different failover scenarios, which resulted in the channel going off air for a good few hours despite a perfectly good, fully functional failover solution. From memory builders on the site next door cracked a gas main so HFR and the other surrounding buildings were told to evacuate, which HFR did by pulling the fire alarm. Assuming it was either another drill or false alarm everyone trooped out and stood on the street. And stood. And stood. And started looking nervously at watches and sent someone scurrying to OFL. And started asking the fire brigade if they could go back in and press the failover button (and being politely told to feck off). So stood while the programme queue ran out and someone sat at the desk in OFL watched the channel go off air unable to do anything about it because OFL couldn't TAKE control, it could only be GIVEN control.
Doh!
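For what it's worth, the "could only be GIVEN control" problem is usually solved with some kind of dead man's switch. A minimal sketch, assuming the main site has to check in periodically and the DR site takes over on its own once the check-ins stop. Everything here (`StandbySite`, `operator_check_in`, the 60 second threshold) is hypothetical and for illustration only, not how C4 or anyone else actually does it:

```python
from dataclasses import dataclass


@dataclass
class StandbySite:
    """Hypothetical DR site that can TAKE control, not just be GIVEN it."""
    takeover_after_s: float        # silence tolerated before taking over (assumed value)
    last_check_in_s: float = 0.0   # when the main site last confirmed it was staffed
    in_control: bool = False

    def operator_check_in(self, now_s: float) -> None:
        # Someone at the main site confirms they are still at the desk.
        self.last_check_in_s = now_s

    def hand_over(self) -> None:
        # The "GIVEN" path: the main site explicitly passes control.
        self.in_control = True

    def tick(self, now_s: float) -> bool:
        # The "TAKE" path: assume control after prolonged silence.
        silent_for = now_s - self.last_check_in_s
        if not self.in_control and silent_for > self.takeover_after_s:
            self.in_control = True
        return self.in_control


dr = StandbySite(takeover_after_s=60.0)
dr.operator_check_in(now_s=0.0)
print(dr.tick(now_s=30.0))    # False: main site checked in recently
print(dr.tick(now_s=120.0))   # True: everyone is out on the street, DR takes over
```

A real system would also need to stop both sites thinking they are in charge at once, which this sketch ignores.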
People rabbiting on about whether or not a fire suppression system would (or indeed should) damage hardware are missing the point. This is a classic case of a lack of risk management leading to an error in egg location. That is to say that there are a number of catastrophic failures that can occur to a given basket (fire, flooding and overzealous fire suppression being only three). As such it's always best to have some hot or even cold standby eggs in another basket located as far from the first basket as is practical.
A classic example of the eggs in basket problem was a fire in a data centre in London causing Plusnet to drop off the planet. Not only did the internet service disappear, but the helpline went dead as well, because all services were routed through the same data centre. They could not even put out a recorded message when you phoned up.
Even if they have tested their DR they can't possibly test everything. If you don't have any hearing impaired people involved with the testing, no one is going to think about testing the subtitling.
You can be sure their ability to deliver ads during programming breaks was thoroughly tested though!
When I used to work for Red Bee as a techy engineering type, C4's 'old' DR in Camden was quite basic and the most important thing was to ensure it had all the programmes and ads delivered to it (file transfers). We tested it each week but it could only provide a cut down version of full programming and it was all SD only (IIRC) - I don't remember if it ever got put to air. They built a new one a few years ago but I don't remember it being staffed as such and again I think it was pretty much just a 'backup'.
The BBC (now) have a much better system with 2 playout centres which are both permanently in use - each doing a couple of channels but able to take the full load of all the channels if the other one keels over.
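The arithmetic behind that sort of setup is simple enough to check in a few lines. A rough sketch, assuming capacity and load are both just counted in channels; the site names and numbers are made up for illustration:

```python
def survives_single_site_loss(capacity: dict[str, int], load: dict[str, int]) -> bool:
    """True if the remaining sites can carry every channel when any one site is lost."""
    total_load = sum(load.values())
    total_capacity = sum(capacity.values())
    # For each site in turn, pretend it has gone and see if the rest can cope.
    return all(total_capacity - capacity[lost] >= total_load for lost in capacity)


# Two centres, each normally playing out four channels (hypothetical figures).
load = {"X": 4, "Y": 4}
print(survives_single_site_loss({"X": 8, "Y": 8}, load))  # True: either can take all eight
print(survives_single_site_loss({"X": 8, "Y": 6}, load))  # False: Y cannot cover for X
```

The point of running both centres permanently is that the second line of output gets noticed during normal operation rather than on the day one of them keels over.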
I remember being told a story that many years ago some of my IT colleagues visited a bank and were shown how they had 2 identical systems which they frequently switched from one to the other to ensure both were working. One thing I always hated was identical systems being labelled 'main' and 'reserve' - people seemed terrified of using reserves thinking they were not as good and so they would not get tested in anger. X and Y or similar are far better.