What is this Huawei of which you speak?
(AC because I work for an 'oligopolist')
O2 has fixed its poorly mobile network, so now everyone can start asking what went wrong and what the company is going to do about it. O2's press office isn't responding to queries. However, our understanding from various sources is that the day-long outage was caused by the transition of subscribers' details to Ericsson's …
The problem with outages like this is that while they might cost the operator a little cash and reputational damage, almost all the pain is felt by the end users.
I'd just love Ofcom to insist on DR via roaming on to other UK Networks. I can't see a big issue, just have other operators flip the switch on roaming when a network is down for a pre-defined time or pre-defined scale and oblige the faulty network to carry the roaming costs while they're down.
Seems like the perfect lever to drive up network resiliency to me - pass the cost of poor redundancy and network management onto the networks rather than users!
I fully agree with you - it is long overdue.
However, it is not as easy as it seems. There is a list of forbidden operators in most SIMs, which is used to reduce signalling load from rejections of "pesky competitor customers" roaming onto your network. This is amended further by the more "positive thinking" preferred operator list present in more modern devices as part of operator customization.
Updating these is a major undertaking and may require re-issuing SIMs for customers who do not have a phone where the operator can manipulate the lists remotely. So it definitely will not be a "flip a switch". More like "painful 12 months of network transition". It is doable though - EE showed it as possible.
It is well worth it for a completely different reason. It removes the last excuse for keeping copper voice as a "universal service obligation". This will do more for "Broadband Britain" than any government money because it will allow "data only" connectivity at a regulatory level. It is also one reason why it is not being done. Crow does not poke a crow in the eye - so do not expect Ofcom to chop the head off one of the last BT cash cows even if it is very good for the country. That is not the way things are done in the UK :(
It'd be nice, but alas in this outage, the trouble wasn't connecting to the network, it was doing anything useful once that connection had been established. O2 would have had to take the entire mast network down in order for phones to roam to any other network (should Ofcom bend their ill-advised competition rules to allow it in such cases, as suggested by the OP) and that would have affected 100% of customers, not just a third... As a footnote, I'm fairly sure the nature of this issue will have meant emergency calls would not have routed via roaming as is usually the case. I hope no-one died.
You'd cause a cascade failure.
No operator is going to carry enough spare capacity to handle the totality of another network failing because it would pretty much double the cost of the network and so double the price of service to the customer.
Without that spare capacity, a failover would trigger failure on the network being used for DR - taking that network's customers down too. Both sets of customers fail over to network three, which again can't cope and then you're left with no networks functioning at all in a pretty short space of time.
It's far better to leave a set of users broken than to arse about with ill-thought out DR solutions that leave everyone with degraded or zero service.
When you say "I can't see a big issue" what you mean is "I've not really thought this through".
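For anyone who wants to poke at the cascade argument above, here is a minimal sketch of it. Everything in it is an assumption for illustration: the network names, the subscriber and capacity figures, and the simplification (taken from the comment) that all displaced users land on one surviving network at a time rather than spreading out.

```python
def simulate_failover(networks, failed):
    """Return the networks that end up down, in the order they fail.

    networks: dict of name -> (subscribers, capacity), hypothetical units.
    failed:   name of the network that fails first.
    """
    load = {name: subs for name, (subs, _cap) in networks.items()}
    capacity = {name: cap for name, (_subs, cap) in networks.items()}

    down = [failed]
    displaced = load.pop(failed)

    while load:
        # Displaced users roam onto the surviving network with most headroom.
        target = max(load, key=lambda n: capacity[n] - load[n])
        load[target] += displaced
        if load[target] <= capacity[target]:
            break  # the extra load was absorbed, cascade stops
        # Overloaded: this network fails too and its users are displaced as well.
        down.append(target)
        displaced = load.pop(target)

    return down


# Hypothetical figures: four equal networks, each with 25% spare capacity.
tight = {"A": (20, 25), "B": (20, 25), "C": (20, 25), "D": (20, 25)}
# Same networks, but each carrying enough spare capacity for a whole rival.
doubled = {"A": (20, 40), "B": (20, 40), "C": (20, 40), "D": (20, 40)}

print(simulate_failover(tight, "A"))
print(simulate_failover(doubled, "A"))
```

With the tight figures every network goes down in turn; with doubled capacity only the original one does, which is the cost trade-off being described.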
That isn't what my sources, who work for OSS suppliers and major operators involved, have said... Quoting:
"Straight from the O2 horse's mouth, it looks like Huawei cocked up an upgrade to the CUDB which is the centralised clustered HLR that Ericsson sold to O2 about 2 years ago. Quote "talked to some guys in Guildford and they told him that Huawei tried to do an update on the CUDB and they didn't configure on the cluster which is the master and which is the slave.""
Now, we know that this can always go wrong (look at RBS' problems recently), but a cluster (certainly all the ones I've worked on) will work out for itself which nodes are active and which are inactive; it's kind of the point of a cluster. I think they may have separated sites without ejecting the DR site's nodes from a geographically spanned cluster, resulting in a split brain scenario.
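To illustrate what split brain means here, a rough sketch of the usual majority-quorum rule follows. This is a generic illustration only, not a description of how the CUDB or any Ericsson or Huawei product actually elects a master; the function names and node counts are made up.

```python
def has_quorum(reachable_nodes, cluster_size):
    """A partition may promote a master only if it sees a strict majority."""
    return reachable_nodes > cluster_size // 2


def masters_after_split(partition_sizes, use_quorum=True):
    """Count how many partitions would end up with their own master.

    partition_sizes: number of nodes on each side of the split (hypothetical).
    """
    cluster_size = sum(partition_sizes)
    if use_quorum:
        return sum(1 for p in partition_sizes if has_quorum(p, cluster_size))
    # Naive behaviour: every isolated side promotes its own master.
    return sum(1 for p in partition_sizes if p > 0)


print(masters_after_split([3, 2]))                    # majority side keeps one master
print(masters_after_split([2, 2]))                    # even split: nobody has quorum
print(masters_after_split([2, 2], use_quorum=False))  # two masters: split brain
```

An even two-site split is the awkward case: with a quorum rule neither side will serve, and without one both sides think they are in charge.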
If the HLR or its equivalent function is seriously stuck up a creek looking for a paddle, the chance of roaming is likely to be hit hard. I doubt that there would be an option to roam to another network without a good HLR function. If you cannot prove a SIM is active and that the handset is good, how can you manage to confirm the billing authority and allow a billing event?
I guess that's a risk the offending network would have to be obliged to take - like when the banks had to take the hit for Cheque Card misuse...
Is the SIM O2? Check
Are O2 on the hook for costs? Check
Off you go lads use some data, make some calls, O2 are picking up the tab!
Maybe it needs a bit of front end infrastructure though, to automatically bypass the HLR and route numbers to the other network's VLR?
I'm expecting a torrent of up votes on this one but it does need saying.
These two outages really do show the false economy of tech headcount cuts and it feels as though the Rubicon has been crossed.
Quite how any large business can get more and more reliant on IS and at the same time cut and cut at support and investment astounds me.
But then I'm not a CFO of a large multinational business so what do I know?
Is it a false economy though? You'd have to compare the revenue lost from the service outage with the savings made that led to the failure - if it was down to a cost saving exercise.
Businesses don't cut costs for a laugh, or even for management bonuses, they do it because the market drives prices ever lower and if you can't cut your costs to reach the market rate you lose all your customers. All the time people buy mobile services based purely on price, this will be the result.
Not only is it reasonable, it's the minimum you can expect.
Agreements for sale of goods and services require that said things are actually provided! This particular parrot, i.e. the 18 hours+ of parrot which didn't arrive, isn't so much an ex- as a non-parrot!
What got me is the article mis-stating "ordinary customers will demand compensation, but O2 has no obligation to provide any."
Indeed no obligation exists by definition, because individual consumers of large companies' products tend not to pre-emptively negotiate damages clauses into their own contracts!
Now, that an obligation could materialise as a result of a successful CLAIM for damages against O2, that's far more likely - albeit something determinable only on the merits of any individual claim. The power of O2's disclaimers of liability in the contract is debatable: it's possible these would be void clauses due to falling foul of the rules in the Unfair Terms in Consumer Contracts Regulations and the Unfair Contract Terms Act, although again it depends on the merits of any individual claim.
Well it's all doom and gloom on here, eh? Why is everyone so quick to point the finger at cost savings/off-shoring/out sourcing when the reality is the cause of the outage is not yet public knowledge?
Do people actually understand what happens when the network management is outsourced? O2's network management may be done by Huawei, but that just means the engineer that used to work for O2 now works for Huawei. Same bloke, different employer. He's not become some incompetent, johnny-foreigner overnight.
And while everyone's moaning about "single-point-of-failure" and "not enough redundancy" etc, maybe they should consider the fact that mobile networks are incredibly large and complex beasts. While I'm in no way excusing what's happened, the fact that it's so rare is a testament to the way these networks are built and managed.
I remember the "single point of failure" complaints made when Vodafone had a similar outage last year.
I will repeat what one person said....
If you are that concerned about single point of failure, why aren't you carrying around an O2 and an Orange phone then?
Network down for a day = a day of getting work done.
Dual-SIM mobile phone. They are readily available and can work on entirely separate phone networks if the best possible resilience is required. Having dual SIM cards also has the advantage that if one network is providing poor coverage in the area you are in, the second network might possibly give a usable signal to allow for making/receiving calls.
Been on O2 pre-pay for about 6 years, never really had a major problem, even this last week.
I have always considered mobile communications a privilege, not a right.
It's crazy what a few people do without a mobile for a day, I wonder what they would do if T.V. broadcasting went down for a week? (Not owned a TV for 6 years now)
If my leccy/Internet went down for more than an hour? I have a few months of unread books to keep me satisfied.
@Richard 12 - you don't pay for them to be delivered 24x7x365 with zero downtime. Even 99.999% guaranteed service would cost you a lot more than the mobile networks charge.
Do you demand a refund from your standing charges on lecky when you get a power cut? Even for something as critical as that, it needs to be out for 18 hours before you get anything.
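To put numbers on the "five nines" point, here is a quick back-of-envelope sketch. The only inputs are a 365-day year and, as an assumed example, a 24-hour outage standing in for the "day-long" one; nothing here reflects any actual O2 service level.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability_percent):
    """Downtime budget per year for a given availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def availability_percent(downtime_minutes):
    """Availability over one year given the total downtime in that year."""
    return 100 * (1 - downtime_minutes / MINUTES_PER_YEAR)


# Five nines leaves a budget of a little over five minutes a year.
print(f"{allowed_downtime_minutes(99.999):.2f} minutes/year at 99.999%")

# A single 24-hour outage (assumed figure) with no other downtime that year.
print(f"{availability_percent(24 * 60):.2f}% after one day-long outage")
```

One full day down in a year still leaves availability above 99.7%, which is a long way short of five nines but gives a sense of how expensive those last few decimal places are.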
Do I demand a refund on my leccy in the event of a blackout? Yes, yes I do, especially if I lose a freezer full of food because of the sh*tty infrastructure my leccy provider has in place :/
On a slight tangent: PlayStation only offered what they did because if a mass lawsuit was brought against them it would ruin them. Therefore O2 should technically refund its users for the time the network was unavailable. They refunded me the cost of some unsent SMS messages a few months ago when the SMS portion of the network failed. I didn't even make a fuss because I was unaware it had happened :/ What's different about this situation?
@takuhii - that's what insurance is for. Is a tree falling on your power line "sh*tty infrastructure"? Do you really expect fully redundant paths of leccy into your house? Can you claim the same refund if you happen to be running a cryogenics bay in your house?
As for "what's different", all your examples were voluntary gestures of good will from the suppliers. They are obliged to do next to jack, other than a refund of the money you spent (but usually after a fixed amount - broadband is 5 days before contractual refunds kick in for example).
If you want to add in requirements for damages to a contract, you have to add it in up front, largely in order for the supplier to cost the contract accordingly. That's contract law and it exists for a reason.
My company has just spent millions kitting out most of our 8,000 frontline staff with new iPhones on the O2 network, only to find they couldn't communicate with any of them. Many of the staff who got these new phones with limited personal use binned their own personal contracts. Cue dozens of techs acting like their left nut had just been cut off.
I shouldn't have gloated to them, but my Orange handset was working fine.
@Andre Carneiro - indeed, although I do love the inevitable schadenfreude that seems to inhabit some people as if they made a reasoned decision to avoid O2 and choose Orange/Vodafone based on some glorious insight. You know, like they researched all the major telcos' internal audit procedures and predicted who would be the most resilient provider.