Network failures
In hindsight, this was completely predictable
Doesn't require much hindsight, it's an example of the "Byzantine Generals problem" described by Leslie Lamport 40 years ago, and should be well-known to anyone working on highly-available systems. With only two sites, in the event of network failure it's provably impossible for either of them to know what action to take. That's why such configurations always need a third site/device, with a voting system based around quorum. Standard for local HA systems, but harder to do with geographically-separate systems for DR because network failures are more frequent and often not independent.
In that case best Business Continuity practice is not to do an automatic primary-secondary failover, but to have a person (the BC manager) in the loop. That person is alerted to the situation and can gather enough additional info (maybe just phone calls to the site admins, a separate "network link") to take the decision about which site should be Primary. After that the transition should be automated to reduce the likelihood of error.