Having just wasted most of the evening skimming through all 262 pages....
They had two data centers in an “active/active” configuration, which neither Sabadell (the Spanish parent bank) nor IBM UK had done before.
ATMs had moved over to the new infrastructure early, so during the load testing they didn’t want to push it until it fell over, because that would break the live ATMs. So for load testing they used a configuration where the tests ran against one data center while the other served the ATMs.
And the transactions they used for these tests were read-only, because it was live data.
Their target for this testing was only 100% of normal peak load, i.e. no margin for growth or miscalculation.
So in real use the performance was inadequate.
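To make the "no margin" point concrete, here is a rough sketch of the arithmetic. The throughput figure and the 50% headroom are made up for illustration (I am not quoting numbers from the report), and `load_test_target` is just a hypothetical helper name.

```python
def load_test_target(normal_peak: float, headroom: float) -> float:
    """Load to test at, given normal peak load and a fractional safety margin."""
    if normal_peak < 0 or headroom < 0:
        raise ValueError("normal_peak and headroom must be non-negative")
    return normal_peak * (1.0 + headroom)


# Hypothetical figures, for illustration only.
normal_peak_tps = 1000.0

# A target of 100% of normal peak is the same as zero headroom.
no_margin = load_test_target(normal_peak_tps, 0.0)

# An arbitrary 50% margin for growth or miscalculation.
with_margin = load_test_target(normal_peak_tps, 0.5)

print(no_margin, with_margin)
```

With zero headroom, any growth or any error in the peak estimate puts real traffic above anything that was actually tested.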
Code quality was poor. Not much detail about that, but it sounds like functional testing was poor and anything that wasn’t tested had no chance.
Documentation was poor esp. for architecture/infrastructure. There was a lack of configuration management e.g. duplicated IP addresses.
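For what it's worth, duplicated IP addresses are the kind of thing a trivial check over an inventory would catch. A minimal sketch, assuming the inventory is available as (hostname, address) pairs; the hostnames, addresses and the `find_duplicate_ips` helper are all hypothetical, not anything from the report.

```python
from collections import defaultdict
import ipaddress


def find_duplicate_ips(inventory):
    """Return {ip: [hosts]} for every address assigned to more than one host."""
    hosts_by_ip = defaultdict(list)
    for host, ip in inventory:
        # Parse the address so malformed entries fail loudly instead of
        # silently being treated as unique strings.
        address = ipaddress.ip_address(ip.strip())
        hosts_by_ip[str(address)].append(host)
    return {ip: hosts for ip, hosts in hosts_by_ip.items() if len(hosts) > 1}


# Hypothetical inventory, for illustration only.
inventory = [
    ("atm-gw-1", "10.0.0.5"),
    ("branch-desktop-1", "10.0.0.6"),
    ("atm-gw-2", "10.0.0.5"),
]

duplicates = find_duplicate_ips(inventory)
print(duplicates)
```

The hard part is of course having a single trustworthy inventory to run this over, which is exactly what configuration management is supposed to provide.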
Branches used remote desktops (Citrix), which had a memory leak that caused them to crash frequently. They had not been tested for a whole 8-hour working day.
The real underlying cause seems to be that TSB (UK) was not treating Sabis as an external provider, as it should have been. The TSB CIO had previously worked at Sabis, and after moving to TSB he knew more about this new system than anyone left at Sabis, so he was conflicted between being customer and supplier. Maybe the TSB board should have been more circumspect and asked harder questions, but not really: there is only so much you can expect them to do. In particular, they were not told how different this was from what Sabadell had done when acquiring smaller Spanish banks.