Who runs the Royal Mail website, I wonder?
Ah, the favicon on royalmail.com is a bit of a giveaway...
Royal Mail’s three main websites have been unavailable since last night, forcing millions of customers to get used to the taste of glue while guessing where their registered post might be. Blighty’s national postal service currently has no online facility. Instead, customers attempting to gain web access to Royal Mail are …
i was dreading using the phone to trace a parcel - i figured i'd be on hold for hours - but it only took 5 minutes of "please press 1 now, otherwise hold the line" before i got to speak to someone
they weren't able to find my parcel either, but that's beside the point - the phone did prove to be a viable fallback device
"A spokesman told us that Royal Mail tech bods are investigating the problem, but was extremely vague when asked to provide further details"... well it's like a sort of problem thingie and it's proving to be problematic to us because, err... well... it's causing a bit of a problem, especially to our, um, problem solvers.
<wonders> did anyone check to see if the cleaners put the plug back in?</wonders>
@ KarlTh
the way it works is, the primary may be big and beefy, but the secondary is less than 50% the spec on everything, starting with memory. the accountants always want a word when that failover box is purchased, as an idle system already gets their thongs in a knot, and a well-specced one is simply inconceivable (credit: Princess Bride). all one can do, as the original system scales up to 90% or better utilization, is:
[1] put forward a funding request for an adequate failover system every year (it is, of course, rejected, or deferred until next year), and/or
[2] bring up a JBOC* (built from spare parts and scavenged components) tertiary that would take some of the load, but is unproven, unsupported, unauthorized, and off the books (management will force IT to ditch it if they find it, and the persons responsible may lose their jobs), and/or
[3] pray (this will not work, but makes everybody feel better, until the primary "falls over and catches on fire").
odds are, CSC are replacing the failed components in the primary, and possibly doing a bare-metal restore (assuming they have a backup). it is also possible that the primary's UPS had a catastrophic failure (power "flapping", or similar), that may have hosed the primary system quite thoroughly, without actually baking anything. in any case, it all sounds exceedingly unpleasant, and the setup was probably planned with a good measure of delusional optimism.
to conclude, this happens, far too often, and is in most cases a management failure, but occasionally a design issue, and very rarely a technical fault (assuming that equipment has a finite lifespan, which all equipment does, component failure is part of the normal scope of events).
*Just a Box Of Components
"Apologies, we were unable to deliver this webpage to you today. As a result the webteam will have stored it overnight in the boot of their car and will return it to the sorting office sometime tomorrow. Please collect it from there after 48 hours ... please bring a form of identification for you and for the IP address to which the page was meant to be delivered"
If this is a failure of a piece of infrastructure equipment, then as far as I'm concerned the muppets get what they deserve. How stupid to have a single point of failure in their infrastructure. But what have we come to expect over the years from Royal Mail... incompetence.
Doesn't look as if they have heard of fault tolerance, does it?
And as for secondary systems only being half the spec of the primary... kinda defeats the point, doesn't it? One would expect the back-up system to provide the same level of service, which obviously means the same spec equipment!
@ KarlTh
you underestimate our fine friends from finance. the secondary is specced at less than 50% on EVERYTHING, including disk space (for which they may have bought a system with 1 big hard drive, if the purchasing is done by the purchasing department and not by IT - not a joke and i've seen this happen). the application may not start, and if the disk got full from replication, the box will hang on boot (if it is a Windows box).
also, depending on what OS it's running, the server may simply drop requests under high load (MS Server 2003 did that when running a commercial Java app server, in a test i've seen).
to conclude, if the secondary is substantially inadequate, it may not boot, or may not be usable if it comes up.
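A back-of-the-envelope sketch of the capacity argument above, in Python. The function name, the single "spec fraction" figure and the example numbers are illustrative assumptions, not anything known about the actual Royal Mail or CSC setup. It treats load as a fraction of the primary's capacity and asks how much of it an under-specced secondary could carry after a failover.

```python
def failover_headroom(primary_utilization: float, secondary_spec_fraction: float) -> dict:
    """Estimate how a smaller secondary copes with the primary's load.

    primary_utilization: load as a fraction of the primary's capacity (0.9 = 90%).
    secondary_spec_fraction: secondary capacity relative to the primary (0.5 = half).
    Both are simplifying assumptions: real capacity is not a single number.
    """
    if not 0.0 <= primary_utilization <= 1.0:
        raise ValueError("primary_utilization must be between 0 and 1")
    if secondary_spec_fraction <= 0.0:
        raise ValueError("secondary_spec_fraction must be positive")

    # Load the secondary would see, expressed against its own capacity.
    secondary_utilization = primary_utilization / secondary_spec_fraction

    # Share of the incoming load it could serve at best (capped at 100%).
    if primary_utilization == 0.0:
        served_fraction = 1.0
    else:
        served_fraction = min(1.0, secondary_spec_fraction / primary_utilization)

    return {
        "secondary_utilization": secondary_utilization,
        "served_fraction": served_fraction,
        "can_cope": secondary_utilization <= 1.0,
    }


if __name__ == "__main__":
    # Hypothetical figures: primary at 90% utilization, secondary at half the spec.
    result = failover_headroom(0.9, 0.5)
    print(f"secondary utilization: {result['secondary_utilization']:.0%}")
    print(f"load it can serve at best: {result['served_fraction']:.0%}")
    print(f"can cope: {result['can_cope']}")
```

Under those assumed figures the secondary is asked for well over its full capacity, which is the point being made: once the primary runs hot, a half-spec failover box cannot take the load even if it boots.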
I bet this wouldn't happen on Deutsche Post's website.
They even have a version of the website in English, not bad for johnny foreigner! (probably a cunning trick to discredit Her Majesty's Finest)
just shows you empires are not built on gritty determination, ruthless efficiency and razor sharp tailoring, but by fate, luck, stiff upper lips and putting on a jolly good show?
The speed of response of the Royal Mail site has been atrocious recently.
Three minutes to get the second class postage page, fifteen minutes and still no response from the print-your-own postage log-on pages.
As far as I can see there's nowhere on their site to complain about the website - just late deliveries etc. How glad I was when up popped a "Take our Survey" and I was able to feed back...
Also surprised that the Royal Mail traceroute goes from UK to New York to Holland - what, no UK servers?
So maybe they've taken it down to add a second processor and a few MB more memory.