Re: "but the code path that failed was never exercised during this rollout"
100% code coverage is unrealistic and generally not worth the expense. That said, this particular path obviously should have been tested. More importantly, it should have been behind a flag that could simply be turned off, mitigating the problem in minutes rather than hours.
Planning for failure is much better than assuming you can prevent every failure from ever occurring. Of course you should test as much as possible, but also assume it'll fail anyway and have a plan for when it does.
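
To make the flag idea concrete, here is a rough sketch of what I mean by a kill switch. Everything in it is hypothetical: the flag name `USE_NEW_PATH`, the `flags` mapping (a stand-in for whatever runtime config store you actually have), and the two code paths. It is not how the system in the postmortem works, just the general shape.

```python
# Minimal kill-switch sketch. All names here are hypothetical.
# `flags` stands in for a runtime config store that operators can
# change without a deploy; a plain dict is used for illustration.

TRUTHY = {"1", "true", "yes", "on"}


def flag_enabled(flags, name, default=False):
    """Return True if the named flag is set to a truthy string."""
    raw = flags.get(name)
    if raw is None:
        return default  # missing flag falls back to the safe default
    return str(raw).strip().lower() in TRUTHY


def legacy_path(x):
    # The old, known-good behaviour.
    return x * 2


def new_path(x):
    # The newly rolled out behaviour (the risky part).
    return x + x


def handle(x, flags):
    """Route to the new path only while the flag is on."""
    # The flag is checked on every call, so turning it off takes
    # effect on the next request rather than on the next deploy.
    if flag_enabled(flags, "USE_NEW_PATH"):
        return ("new", new_path(x))
    return ("legacy", legacy_path(x))
```

The point of defaulting to off is that a missing or unreadable flag lands you on the known-good path. Mitigation then becomes "flip `USE_NEW_PATH` to false" instead of "roll back the release".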