
Single point of failure?
I have a colleague who used to work at BA. From what he told us, they use an Enterprise Service Bus (ESB), a Java-based message queue system that can be configured as publisher/subscriber or point-to-point, depending on the messaging needs.
From experience, this ESB can be a pain to get started, even for experienced people after a scheduled power cycle. On top of that, if you want the fastest possible message response, the messages are held in memory with no backup on disk; you can use a database for the message queues, but that slows things down a lot.
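To illustrate the trade-off, here is a rough Python sketch of the two storage choices. It is not BA's setup or any real ESB product; the class names, the log file and the "booking update" messages are all made up for the example. One queue keeps messages in memory only, the other appends each message to a file and syncs it to disk before returning, which is where the slowdown comes from.

```python
import json
import os
import tempfile
from collections import deque


class InMemoryQueue:
    """Hypothetical fast queue: messages live only in process memory."""

    def __init__(self):
        self._messages = deque()

    def publish(self, message):
        self._messages.append(message)

    def pending(self):
        return list(self._messages)


class FileBackedQueue:
    """Hypothetical durable queue: each message is written to a log file
    and fsynced before publish() returns, so it costs a disk write."""

    def __init__(self, path):
        self._path = path

    def publish(self, message):
        with open(self._path, "a", encoding="utf-8") as log:
            log.write(json.dumps(message) + "\n")
            log.flush()
            os.fsync(log.fileno())

    def pending(self):
        if not os.path.exists(self._path):
            return []
        with open(self._path, encoding="utf-8") as log:
            return [json.loads(line) for line in log]


def simulate_crash_and_restart():
    """Publish to both queues, then pretend the process died and restarted."""
    log_path = os.path.join(tempfile.mkdtemp(), "queue.log")
    fast = InMemoryQueue()
    durable = FileBackedQueue(log_path)
    for n in range(3):
        msg = {"id": n, "body": "booking update"}  # made-up payload
        fast.publish(msg)
        durable.publish(msg)
    # "Crash": discard the old objects; a restarted process starts from scratch.
    del fast, durable
    return InMemoryQueue().pending(), FileBackedQueue(log_path).pending()


if __name__ == "__main__":
    lost, recovered = simulate_crash_and_restart()
    print("in-memory queue after restart:", lost)
    print("file-backed queue after restart:", recovered)
```

After the simulated restart the in-memory queue comes back empty, while the file-backed one still has every message it was given.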
I am guessing that this piece of software failed to restart properly and, since it was configured for maximum speed to handle the load, all the messages queued before the crash were lost.
Just a wild guess.