Who here didn't know Capita is indeed a single point of failure?
'Major incident' at Capita data centre: Multiple services still knackered
A major outage at a Capita data centre has knocked out multiple services for customers – including a number of councils' online services – for the last 36 hours. Some of the sites affected include the NHS Business Services Authority, which apologised on its website for the continuing disruption and said it hoped its systems …
COMMENTS
-
-
-
Saturday 27th May 2017 07:18 GMT Anonymous Coward
Still at least the weather is perfect!
For meatsacks not inside a building, yes. But an interesting thought is that summer is now a time of real grid instability, because all that essentially unplanned solar PV dumped on the grid causes huge instability. Varying output (both predictable and not), asynchronous supply, lack of system inertia, all of these cause network and transmission problems. The hippies may be rejoicing when there's a "no coal" day, but the system operators are sweating, I can assure you.
And those network stability problems don't need to be absolute failures - just sufficient to push a particular line or substation out of tolerance and trip a breaker, and Bingo! Then you get the knock-on effects. I can't say that had any bearing on Crapita's problems, but it's a big deal that worries the network operators.
-
Saturday 27th May 2017 08:02 GMT Anonymous Coward
"lack of system inertia ... a big deal that worries the network operators."
Doesn't seem to worry anybody in the UK power industry enough to actually *do* much about it (e.g. invest in robustness). Competing privatised stovepipes is not an obvious way to encourage proper joined up thinking and consideration of the bigger picture - but who knew that?
Anyway, it's 2017. System inertia, for example, doesn't just come from large lumps of rotating mass. It can come from "synthetic inertia" based on modern high performance power electronics, which achieve the same result as the rotating mass but do it more flexibly, via digital control mechanisms.
Companies like ABB have, not surprisingly, been doing this at grid scale for a few years now. [In principle GEC might have had a go too, if they hadn't gone bust almost two decades ago, having made a strategic decision to put money in the bank rather than to invest in products and people and technologies.]
See e.g. this handy summary of synthetic inertia in general:
http://www.ee.co.za/article/synthetic-inertia-grids-high-renewable-energy-content.html
and/or for some rather more detailed analysis with a specific focus on wind, there's e.g.
http://elforsk.se/Rapporter/?download=report&rid=13_02_
The UK have largely been ignoring these options, preferring to whinge ("insufficient inertia") rather than invest. It's so much more profitable to continue relying on 1960s miracles of engineering such as Dinorwig's fast response pumped storage, and to build relatively quick response diesel generator farms around the country. But other options are available, though some may require people to "think different" and worse still some of the other options may have a short term negative effect on corporate financial results. And apparently that's not allowed.
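To put rough numbers on the inertia point, here is a minimal sketch based on the textbook swing equation. It assumes the whole grid can be lumped into one equivalent machine, and that synthetic inertia simply adds to the rotating inertia constant, which is a simplification of what real converter controls do. The function names and all the figures are made up for illustration; they are not National Grid data.

```python
def rocof_hz_per_s(power_loss_mw, inertia_constant_s, system_base_mva, f0_hz=50.0):
    """Initial rate of change of frequency (Hz/s) right after a sudden loss of
    generation, from the swing equation: df/dt = -f0 * dP / (2 * H * S)."""
    return -f0_hz * power_loss_mw / (2.0 * inertia_constant_s * system_base_mva)


def effective_inertia_s(rotating_h_s, synthetic_h_s):
    """Assumption: synthetic inertia from power electronics is modelled as a
    plain addition to the rotating inertia constant H (in seconds)."""
    return rotating_h_s + synthetic_h_s


# Hypothetical numbers: a 1000 MW infeed loss on a 30 000 MVA system.
loss_mw = 1000.0
base_mva = 30000.0

low_inertia = rocof_hz_per_s(loss_mw, 3.0, base_mva)
with_synthetic = rocof_hz_per_s(loss_mw, effective_inertia_s(3.0, 2.0), base_mva)

print(f"RoCoF with H = 3 s: {low_inertia:.3f} Hz/s")
print(f"RoCoF with H = 3 s + 2 s synthetic: {with_synthetic:.3f} Hz/s")
```

The smaller the combined H, the faster the frequency falls after a trip, and the less time protection and reserves have to respond. That is the problem either rotating mass or fast power electronics has to solve.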
[more in a moment]
-
Saturday 27th May 2017 08:04 GMT Anonymous Coward
Re: "lack of system inertia ... a big deal that worries the network operators."
[continued]
Then again, maybe this (from 2016) is a better late than never sign of better things to come in the UK:
http://uk.reuters.com/article/national-grid-battery-idUKL8N1B72XQ
"Aug 26 EDF Renewables, Vattenfall and Eon were among seven companies which won four-year contracts with Britain's National Grid to supply super fast balancing services, National Grid said on Friday.
The contracts are the first Britain's power grid operator has awarded to battery storage technology, and were worth a total of 66 million pounds.
National Grid needs to balance electricity supply and demand on the grid on a second-by-second basis to make sure the system runs efficiently.
A total of 201 megawatts (MW) of capacity -- roughly the same amount as produced by a small power station -- was secured from seven companies at eight different sites, with the earliest contract starting in October 2017 and the latest in March 2018.
The amount each company was awarded depended on the amount of capacity offered and how long it would be available for.
[continues]".
-
-
-
-
Friday 26th May 2017 13:26 GMT GingerOne
How is a company as big as Capita reliant on ONE datacentre? Even forgetting their myriad other failings, surely this is reason enough for all of their customers to jump ship and for no one ever to employ their services again.
I just cannot believe this. Literally day 1, week 1, IT basics - make it fucking resilient!
-
Friday 26th May 2017 13:58 GMT Anonymous Coward
"make it flipping resilient!"
Why would the people in charge want to make it resilient? It'll eat into those people's bonuses, surely?
Until the impact of failure directly hits the pockets of the people in charge, and has a bigger impact than the cost of failure when it happens, those people have no motivation to build resilient systems.
This isn't the 1990s any more you know, when IT people built systems resiliently **because it was the right thing to do for the customer**, and if you were good as a designer a system that provided critical functions in a degraded mode in the presence of partial failures wasn't always that much more expensive (in $$$$) than a basic setup straight from the box-shifters stocklist.
Those days are long gone. When did you last read a news item relating to (e.g.) Tandem NonStop, or other high availability technology or techniques? Devops, yes. Kodi, yes. Drones, yes. Resilient systems? Pointers welcome.
-
-
Friday 26th May 2017 14:00 GMT Anonymous Coward
Uh-oh!
"Good afternoon, my name is Steve in Mumbai. I see that the fault you have reported is complete loss of data centre and failure of DR. I am here to help you with your complete loss of data centre and failure of DR. May I ask you first, have you tried turning your computer off and on again?"
-
-
Friday 26th May 2017 15:35 GMT GingerOne
Is my place of work an anomaly? We don't have DR because we have a resilient always-on system with our own private cloud. I just don't understand why the beancounters in these places don't understand. Yes, good IT costs money, but guess what - it's worth it when shit goes wrong.
If we lost a datacentre it would be a big worry for the infrastructure team and the rest of us in IT because our resilience would be affected, but the general userbase would carry on working as normal, none the wiser to any problems.
-
Saturday 27th May 2017 08:48 GMT easytoby
It's an anomaly in comparison to NHS and many public sector and large charity situations. Here the knowledge in the customer organisation to specify and enforce appropriate contracts is missing. Also missing in many cases is the leadership strength to demand proper action on 'difficult' situations.
-
Saturday 27th May 2017 16:58 GMT Terry 6
Part of the problem is that the bean counters demanding the (illusory) cost savings that lead to outsourcing all sorts of services also refuse to pay for/retain the staff that can keep control of it. i.e. You don't just get rid of the school meals service, the cleaners or the payroll etc., you also get rid of the staff from those departments who know what is needed, and how it should be run. In fact, since the options for front-line staff savings are often not that great, those supervisory and middle manager staff are the jam on the toast that helps to make the outsourcing costs seem to add up. And middle managers are always seen as a fair target, whereas the top brass on huge salaries always seem to survive.
(And no, I'm not a middle manager, but I've seen how they and senior front-line staff can make so much difference.)
-
-
-
Friday 26th May 2017 18:56 GMT Shareholder
System failure
Sys failure caused by incompetent directors, caused by a fourth rate HR section that can only select staff by looking at a bit of paper - not on ability. See what can happen!! Have read enough reports showing bad choices. 90% should be removed immediately, before customers leave.
-
Friday 26th May 2017 19:44 GMT Terry 6
It is an eternal mystery
Capita/G4S/whoever can hit the headlines for all the wrong reasons. Do all the potential clients run away from them as fast as they can possibly go? Or do they continue to line up and buy more?
What would you expect to happen - and what does happen?
It seems as though when you get big enough no amount of incompetence and failure can be enough to bring you down.
-
Saturday 27th May 2017 07:27 GMT Anonymous Coward
Re: It is an eternal mystery
"when you get big enough no amount of incompetence and failure can be enough to bring you down"
The concept of too big to fail was pioneered by the banks with great success. I think that other sectors saw the financial crisis, and said "we'd like a piece of that". So Crapita have made themselves a de-facto part of the public sector and too large to be allowed to fail. But not just them. You might argue that there's alternatives to Google, and that Facebook is an unnecessary frippery. But would the US government really let those huge and convenient spying machines collapse if push came to shove?
As another poster comments, the public sector customers ought to be able to nail Andy Parker's scrotum to a gate post, but won't because they are poor at agreeing contracts, poor at interpreting contracts, and worse at holding big suppliers to account. In fairness, the OP didn't mention Fat Andy's knackersack, but the general drift was there.
-
Saturday 27th May 2017 15:06 GMT Anonymous Coward
Re: It is an eternal mystery
well let's wait and see... they have the whole of the bank holiday weekend to cobble together some kind of solution... if they're not back by Tuesday surely someone will start to ask some serious questions about the outsourcing culture that we've adopted via stealth campaigns over many years... this could become a very hot political potato.
-
Saturday 27th May 2017 18:34 GMT Anonymous Coward
Re: It is an eternal mystery
surely someone will start to ask some serious questions about the outsourcing culture
What's that?
Rocking the boat that's floated by the extreme capital investment leverage bought by putting customers' balls 5 cm over the asphalt at 110 miles/h?
Not going to happen if people with share options can pretend to be the one company which exploits IT with efficiency that cannot be found anywhere else on the planet.
-
-
-
Friday 26th May 2017 20:47 GMT fruitoftheloon
the other data centre...
I left Capita 10 yrs ago; we had a v v important internal system in West Malling and a 'warm' DR standby in the other data centre.
We did a real fail-over test (ironically) on my last day, it worked fine...
I wonder if some of those afflicted by this fsck up haven't been paying for a warm/hot DR, if not, TOUGH SHIT!!!
-
Friday 26th May 2017 22:48 GMT Anonymous Coward
Just like Pigs...
... Capita parts don't fly!
The anonymous customer gave Capita undue credit when he said "They have probably had to fly parts in from out of the country as the infrastructure is so old."
Were parts needed for this outage (seems unlikely) then I can categorically say that Capita will use the cheapest means possible to ship them - usually next day courier, as immediate couriers are considered too expensive and need 2 manager approvals. This itself causes untold delays because 1) managers are rarely available 2) bonuses could take a hit so extreme reluctance to authorise persists.
Also, why should they worry when they're not the ones hurting with system outages when so often the pain is carried by their customers? Generally the take is that if the customer was stupid enough to take out a contract without service penalties then there is no need for them to pull their finger out. When parts are needed the first question (before what part do we need?) is "Are there service penalties?"
-
Saturday 27th May 2017 05:51 GMT amanfromMars 1
The Revolution will be Virtualised
Clouds Hosting Advanced Operating Systems in Chaos and Melting Down. Well, well, well ...... Who'd have a'thunk it ...... a Cyber FCUKishima in Dumb Servering Systems.
And to think that such is only the Start of the Beginning
of All that is Planned. Or would you like to think and disagree?
-
Saturday 27th May 2017 07:45 GMT Anonymous Coward
The way to a grand upgrade of the DC's hardware appeared to be not very hard-to-find... and the shareholders would finally welcome this long-awaited opportunity to invest in the stability of their own future income...
Just a power fault, not the value service infrastructure (-:
What do you think would be the lower bid and how long will it stay on bottom after *this*?
-
Saturday 27th May 2017 15:02 GMT handleoclast
British Airways
BA has suffered a "major IT systems failure" that is affecting its global operations.
Coincidence or another Crapita customer?
This one is resulting in catastrophic disruption. Lots of delays and cancellations. On a holiday weekend, one of their busiest times. Gonna be a lot of compensation paid out to very unhappy passengers.
If (it's a big if, I'm guessing here) BA's IT was outsourced to Crapita, BA is going to demand major compensation from Crapita. Council claims for compensation would be trivial compared to this. So if that's the case, and you have shares in Crapita, now would be a very good time to sell them.
Again, let me emphasise, I'm guessing. Could be no more than coincidence.
-
Saturday 27th May 2017 15:54 GMT Anonymous Coward
Re: British Airways
I feel sorry for the passengers but not Bastard Airways, serves them right for outsourcing to India:
"The GMB union says this meltdown could have been avoided if BA hadn't made hundreds of IT staff redundant and outsourced their jobs to India at the end of last year."
Source:
http://www.bbc.co.uk/news/uk-40069865
-