
Has this ever happened to AWS or Google?
Rackspace has not offered any explanation of the "security incident" that has taken out its hosted Exchange environment and led the company to predict multiple days of downtime before restoration. In response to inquiries from The Register, Rackspace said its incident status page and an FAQ provided to customers are all it can …
>>Microsoft 365 generally falls over for a few hours to a day at a time
To be fair to Microsoft (M$) and Office 365, I have rarely experienced a complete wipe-out of service, and the perpetual advisories have had no impact on my organisation.
However, I can't recall there being much more than a day without advisories such as :-
"Some admins' DLP policies may intermittently not be applied as expected to files in SharePoint Online"
"Users' email list downloads via Threat Explorer may fail to download"
and
"Teams Only users’ calls placed to Skype for Business hybrid users are dropping after 30 seconds"
These seem minor to me, because I am not directly affected; I imagine the last one might cause real problems for some. How do they break just part of a service?
M$ seem to have elevated borkage to the next level (12? After all, 11 is just level 10 with a pretty face).
I'm not sure anything will save this whale.
I've no idea what percentage of their customers is affected. The stock has dropped from ~$14 to ~$5 YTD. A steady decline.
Will those impacted return? I imagine the non-techies won't, as they've been sold hand-holding only to find none. I imagine the techies won't either, because they can grasp how shambolic the response has been.
For non-technical users? Definitely not.
For technical users? If you already have your own 24x7 NOC, and can build up sufficient expertise in running the platform, then maybe. It's your call whether the cost is justified, and whether your own people can do a better job.
But I think you'd probably be better finding a more competent hosting company. The big three (AWS, Google, Microsoft) have the best possible operations teams, the best possible disaster recovery preparations, and the best possible security and incident management. The smaller ones? Not so much.
Non-technical users rely on outsourced techies, on-prem or off.
Off-prem is cheaper, but business continuity is at greater risk, as this incident makes evident.
If the incident were on-prem, the techies would be telling their client the details. Off-prem....
Can <your> business afford <unknown> days of incapacity?
Since they laid off all the long term techies and moved their customer and tech support to Asia, it has, speaking as a RS customer, been a true omnishambles. We only use them for basic services now thankfully. I would really not like to be reliant on RS these days for anything serious.
Even back a decade ago the "fanatical support" took a long time to arrive. I remember well a server being supplied with the wrong OS installed on it; it took a day to get that fixed. Then came database troubles that UK support couldn't sort, so we had to wait for the US support team to wake up.
All felt like a company that was too big for its own organisational abilities.
You really find out what a service supplier is like when something goes wrong.
Whilst the incident is being investigated, they don't have all the info to disclose fully what happened. The fact they've shut down/isolated the service suggests a compromised system, and that is 100% the right first course of action in this type of scenario.
I've had the unfortunate experience of recovering an Exchange 5.5 server that suffered a failed RAID card on Christmas Day many moons ago: despite regularly tested restores, it took a colleague and me ~24 hours to restore one instance, including much mucking about with eseutil.
Press 'F' to pay respects to the engineers dealing with this mess right now, hope they figure out how to restore services to customers soon.
Fair comment.
I would add that anyone relying on similar infrastructure might like to take note of this policy: if the service they subscribe to encounters similar problems, this is the level of response they should expect, and they should review whether that is acceptable business practice for them.
I think you'll find there are techs behind the scenes working their arses off, poor sods on help lines getting all sorts of crap from (legitimately) irate customers, and press people being told what to say or how to fend off those who ask the wrong questions. Meanwhile, management is actively failing to disclose to customers what the problem is, whether their data is safe, and when service is expected to resume. Even if this was a technical issue which could not have been foreseen, it smacks of complete management mishandling.
The best part: after I spent 7 hours on Sunday getting another back-end email server spun up to successfully regain email, I saw today a mea culpa EMAIL that Rackspace sent out on Sunday evening, apologizing and assuring customers blah, blah, blah... an EMAIL from the CEO "sent" to customers who have no EMAIL service!!!!!!
Oh yeah, when I worked at the university here (this was in 1999, so no smartphone to save the day), ITS ("IT Services") decided one day to cut off the internet service to our building. I called ITS's service line to have them turn our internet service back on, to which they suggested I use their web form to request a service call (a form, of course, only accessible from within the university's network). I pointed out that our internet had been shut off, which flew right over their heads; they suggested the web form again. So I explained directly that with no internet service I could not load any web pages. They suggested e-mailing somebody next, so I then had to point out that no internet means no e-mail either.
ITS was in fact bad enough that the computer science department paid for their own internet connection just to avoid having to interconnect to the university network and so have ITS think they could come in and fiddle with stuff in the building (after a few years of this, ITS came to an agreement where they provided service to the building but could not touch anything past the "outside connection" network switch.)
I know why they are having so much trouble: Exchange is unnecessarily complex, certain options are poorly documented, or (even worse!) the documentation does not match the actual behavior of the software. It *shouldn't* be that complex. Other than sendmail's config files being inscrutable (and even those rarely need changing for internet e-mail), running an e-mail server is simple: there's a clear way to add spam filtering, a clear way to have multiple systems handle the incoming e-mail, and a choice of webmail systems to sit on top of it all. But Exchange was overcomplicated in the 1990s, and from what I've heard it has added features and options much faster than it has removed obsolete ones. I'm sure (since Microsoft would rather you bought e-mail service from them) that large-scale deployment is now one of the parts where the documentation is particularly poor. Microsoft 365's e-mail (which is essentially also hosted Exchange) is really no better, but at least Microsoft has to deal with the maintenance...
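That "multiple systems to handle incoming e-mail" part really is just DNS. A purely illustrative zone fragment (the hostnames are invented): the lower MX preference value is tried first, and the second record catches mail when the primary is unreachable.

```
example.org.  IN MX 10 mx1.example.org.   ; primary, tried first
example.org.  IN MX 20 mx2.example.org.   ; backup, used if mx1 is down
```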
I provide support for someone who (based on a previous IT admin's advice) went with Microsoft 365 e-mail, and that is complicated too, and oddly slow at times. In a recent encounter, they just wanted one old e-mail address decommissioned as a real mailbox, with its mail forwarded to another address under the same Microsoft 365 account in case anyone still sent anything to it. They made the change through the web interface, sent a test e-mail which did not go through, and thought they'd screwed something up. I thought they probably had too. The docs were confusing and useless (especially since, without version numbers, loads of docs are for "versions" of 365 from five years ago, and the menus have all been moved around since then). After plenty of looking, I finally found people commenting that this setting, which seems like it should apply within seconds, sometimes takes hours to kick in. That was it: later in the day the forwarding worked! Keep in mind, something as trivial as that, and it's not clearly and conspicuously documented. I would put a note right on the settings page saying the setting may take 30 minutes to however many hours it usually takes to apply.
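If you're scripting around this sort of delayed-apply behaviour, a minimal sketch (the function and its defaults are mine, not any Microsoft API; `check` is a caller-supplied probe, e.g. "send a test mail to the old address and look for it at the forwarding target") of polling with capped exponential backoff rather than concluding failure on the first try:

```python
import time

def wait_until_applied(check, max_wait_s=4 * 3600, initial_delay_s=60):
    """Poll check() with capped exponential backoff until it returns
    True or max_wait_s elapses.  Returns True once the change is seen.
    """
    waited = 0.0
    delay = initial_delay_s
    while waited < max_wait_s:
        if check():
            return True
        time.sleep(delay)
        waited += delay
        delay = min(delay * 2, 30 * 60)  # cap the gap between probes at 30 min
    return check()  # one last try at the deadline
```

Nothing clever, but it beats re-saving the setting over and over because the test mail hasn't arrived yet.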
"After 4hrs of hold time and 3 1/2hrs of tech time still no mail ending up in my new 365 mailbox unless I send it to myself???"
Consider yourself lucky. I just had my callback from Rackspace support after waiting since Friday (today is Tues) only to find it was a robocall and there was no one at the other end.
:-(
The stupidest thing is, it's not even an Exchange issue I need their help with!
After setting up your new mailbox, you have to wait up to 24-48 hours for the changes to propagate across the Internet. Patience you must have, my young padawan.
“ After setting up your new mailbox, you have to wait up to 24-48 hours for the changes to propagate across the Internet. ”
I spun up my own M365 instance, because after this I’m done with Rackspace. The DNS has been the absolute worst thing about this entire nightmare. Normally, when I’m changing mailservers I’ll adjust the DNS TTL from the normal 48 hours to 5 minutes a week ahead of time and do the cutover & mail transfer when I know the risk of in-flight email being misrouted is very low.
Because of this shitshow, I've no idea how many emails have been lost whilst I waited for the update to propagate. It's now three days afterwards, and mail flow looks about right.
Reconstruction of email archives has been problematic, but I’m almost there.
Patience is indeed the key here, along with understanding that we are now minimizing data loss rather than completing a successful migration.
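The TTL-lowering routine described above comes down to a timeline calculation. A minimal sketch (the function name and defaults are mine): the TTL must be lowered at least one *old* TTL before the cutover, because a resolver that cached the record just before the lowering holds it, with the old TTL, for up to that long; after the switch, stale answers can persist for one *new* TTL.

```python
from datetime import datetime, timedelta

def cutover_plan(cutover_at, old_ttl_s=48 * 3600, new_ttl_s=300):
    """Return (when to lower the TTL, when mail flow should settle)
    for a mailserver cutover at `cutover_at`."""
    # Lower the TTL one full old-TTL ahead, so every cache has expired
    # the long-lived record by the time the MX actually changes.
    lower_ttl_by = cutover_at - timedelta(seconds=old_ttl_s)
    # After the switch, the worst case is one new-TTL of stale answers.
    settled_by = cutover_at + timedelta(seconds=new_ttl_s)
    return lower_ttl_by, settled_by
```

With the 48-hour/5-minute figures above, that puts the TTL change a full two days before the cutover and settles mail flow within minutes of it.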
We use Rackspace and absolutely hate calling in problems. Nobody in our office can understand their techs' heavy Indian accents. Recently there have been numerous calls from people in India asking me to take a survey or to respond to a spam ticket; they would call me every frickin day from a San Antonio number. Last week an English-speaking product manager emailed me about our services, and I told him to tell his associates to leave me the hell alone and stop calling me literally all the time.
I run a team that does a few on-premise Exchange migrations to Office 365 a year. They know their s**t and have a good relationship with MS's Exchange support engineers. I can assure you that no on-premise-to-cloud migration is easy, simple or quick. If the on-premise Exchange hasn't been well maintained, it can take a week or more of engineering time just to get the system into a good enough state to consider migrating mailboxes, and that's for a tiny Exchange system, say fewer than 100 users. At the scale Rackspace is working, things are in a different league. Even if the Exchange is well maintained, it can still take time to prep the system before migrating mailboxes.
Once your on-premise Exchange is ready and you start migrating mailboxes, you'll hit Office 365 rate limiting. This is a real b***h and there are no workarounds: no amount of pleading with Microsoft will get these rate limits raised for the period of your migration. This will be killing Rackspace right now, hence their suggestion to just forward email.
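Since the limits can't be raised, the only client-side option is to honour the throttling and retry. A hedged sketch: `migrate` and the `Throttled` exception here are hypothetical stand-ins for whatever migration call and throttling signal your tooling actually exposes, not a real Microsoft API.

```python
import random
import time

class Throttled(Exception):
    """Hypothetical: raised when the service rate-limits a request,
    carrying the server's suggested wait (like an HTTP Retry-After)."""
    def __init__(self, retry_after_s):
        self.retry_after_s = retry_after_s

def migrate_with_backoff(migrate, mailbox, max_attempts=8, jitter_s=1.0):
    """Call migrate(mailbox), sleeping out any throttling responses.

    Honours the server's suggested wait and adds random jitter so that
    parallel migration workers don't all retry in lock-step."""
    for _ in range(max_attempts):
        try:
            return migrate(mailbox)
        except Throttled as t:
            time.sleep(t.retry_after_s + random.uniform(0, jitter_s))
    raise RuntimeError(f"gave up on {mailbox!r} after {max_attempts} throttles")
```

The point is that throughput is dictated by the service, not your hardware, which is exactly why a bulk migration of this size drags on for days.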
And let's not forget the features that either don't exist in Office 365 Exchange or work differently or the bits of Exchange data that an official MS migration won't copy to the cloud.
On-premise Exchange's days have been numbered for several years. I bet Rackspace thought they could make money for old rope by sweating a hosted Exchange service. That "easy money" has just turned into a PR disaster.
"On-premise Exchange's days have been numbered for several years"
What are the alternatives?
I support a small business with half a dozen mailboxes on on-premises Exchange 2019. The business owner has an aversion to anything cloud.
Requirements are emails foremost with contacts and calendar functionality across multiple devices close behind.
I took over from a different IT support provider, and in 2020 I built and configured my first Windows server with 3x VMs (PDC, Exchange, and an application server) to take over from their old setup, which included Exchange 2010.
" I support a small business with half a dozen mailboxes on on-premises Exchange 2019. The business owner has an aversion to anything cloud.
"On-premise Exchange's days have been numbered for several years"
What are the alternatives? "
I can provide one. It's called a Linux box running Sendmail with a couple of my own milters.
And what have you done about all of the non-email functionality Exchange brings along?
For better or worse, businesses go with Exchange in large part because it does that other stuff: "oh, and it also does email".
Linux emulations and replacements exist, but they're not as cohesive nor do they interact quite as well with Outlook.
IANAL - Genuine Q
As I understand it, these acts say: if the supplier (the second party) cannot resolve the issue within reasonable parameters, the consumer may go to another party to resolve it, with the supplier getting the first shot at resolution, and the original supplier then foots the bill.
Does this apply here?
I'd guess there's no point in trying to get a US company to reimburse a UK company one-to-one (jurisdiction, in this case); the question is about UK law. And then a US class action to get them taken down?
The origin of my question is the (protection of) non-techies that are mid-shaft, paying for someone else to clean up on SpackRace's clusterfuck.
Update: shortly after posting this El Reg provided the latest episode in the series, which has further been updated to include Stephenson, et al. v. Rackspace Technology.
Thanks Vultures!