Navitaire isolated the failure to the device in question quite quickly, but the decision to repair the device proved "less than fruitful and [it] also contributed to the delay in initiating a cutover to a contingency hardware platform." A failover process that should take around 90 minutes took the best part of a day.
There's your problem right there. Okay, hardware failures happen; we know this. That's why you build a failover / warm standby / whatever environment, so that WHEN the completely unexpected, impossible-to-plan-for failure happens, it's not a complete disaster.
Given that you've already got the failover environment, when the dreaded happens, USE IT! No point having it if you don't use it. Muppets!
(Speaking from experience here: our live storage decided to wipe all its config... 2 hours in, we decided that a repair would take too long and switched all services to the DR site.)