It's the HLR
I happen to know that T-Mobile are upgrading their HLRs around about now, and it sounds like this is what caused the problem. Having a backup close to hand sounds a lot like -- we migrated, it was fecked, we rolled back.
There's not really much you can do in these instances. HLRs are a bit old skool (at least the ones T-Mobile were replacing) and are designed to be highly available, highly resilient in their own right. But, if you're swapping from one to another -- there's always a chance things can go wrong.
To be honest, it seems like they did a pretty good job of containing the issue.
@Danny: SQL Server? Are you having a laugh? That's _certainly_ not carrier grade. If the HLRs were SQL Server based, you'd never connect a call!
@Yorkshirepudding: that's really just physics, and not much T-Mobile can do about it ("ya cannae change tha laws of physics, cap'n", etc. etc.). T-Mobile runs at 1800MHz, while O2 (and Voda) are on 900MHz. The lower frequency has greater penetration. Hence O2 and Voda customers can use their phones where Orange and T-Mobile customers can't. As far as I'm aware, all networks are using 2100MHz for 3G, so are all as fecked as one another in that area.
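For anyone who wants a rough number on the 900 vs 1800 point, here's a back-of-envelope sketch using the textbook free-space path loss formula. Big caveat: this is open-air loss only, and the 1 km distance is just a figure I picked for illustration. Getting through walls depends on the building and is a separate matter, so treat it as a ballpark rather than anything a network planner would use. The function name is my own.

```python
import math

def fspl_db(distance_km, freq_mhz):
    """Free-space path loss in dB (standard formula, distance in km, frequency in MHz)."""
    return 20 * math.log10(distance_km) + 20 * math.log10(freq_mhz) + 32.44

# Assumed distance of 1 km, purely for illustration.
loss_900 = fspl_db(1.0, 900)
loss_1800 = fspl_db(1.0, 1800)

# Doubling the frequency adds 20*log10(2) dB of loss at any given distance.
gap_db = loss_1800 - loss_900
print(f"900 MHz: {loss_900:.1f} dB, 1800 MHz: {loss_1800:.1f} dB, gap: {gap_db:.1f} dB")
```

So in open air the 1800 signal is already about 6 dB down on the 900 one at the same distance, before any walls get involved.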
Paris, 'cos we all know when she goes down.