So...
Microsoft STILL doesn't understand that test and production systems should be kept completely isolated from each other. Sheesh, when will they learn!
A code deployment for Azure Container Apps that contained a misconfiguration triggered prolonged log data access issues, according to a technical incident report from Microsoft. The incident, which began at 23:15 UTC on July 6 and ran until 09:00 the following day, meant a subset of data for Azure Monitor Log Analytics and …
Why is it so hard for programmers to think of diff'ing the incoming config against the currently active one?
No, no, let me guess: there is a timestamp field for when the config is sent out, and because that has progressed by five to ten seconds, a naive diff never comes up clean...
Sigh.
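To sketch what the commenter means: diff the incoming config against the active one, but skip volatile fields like timestamps so a no-op push is actually detected as a no-op. The field names (`sent_at`, `log_sink`) are made up for illustration; nothing here is from the actual Azure incident.

```python
# Hedged sketch: compare configs while ignoring fields that always change.
def meaningful_diff(active: dict, incoming: dict,
                    ignore=("sent_at", "revision")) -> dict:
    """Return only the keys whose values actually differ, skipping volatile fields."""
    keys = set(active) | set(incoming)
    return {
        k: (active.get(k), incoming.get(k))
        for k in keys
        if k not in ignore and active.get(k) != incoming.get(k)
    }

active = {"log_sink": "prod", "sent_at": "2024-07-06T23:15:00Z"}
incoming = {"log_sink": "prod", "sent_at": "2024-07-06T23:15:07Z"}

if not meaningful_diff(active, incoming):
    print("no real change - skip the redeploy")
```

Only the timestamp moved, so the diff is empty and the redeploy can be skipped.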
After a service fails, then fails again within 5 seconds of the restart, what is the chance it will run fine after yet another restart within the next 5 seconds?
Micros~1 should have learned long ago to (rate-)limit the automatic restarts of services that fail immediately after initialization.
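A minimal sketch of that rate-limiting idea, as exponential backoff on crash loops: if the service dies within a few seconds of starting, double the delay before the next restart instead of hammering it again. The window and cap below are illustrative numbers, not anyone's actual policy.

```python
# Hedged sketch of crash-loop backoff for a service supervisor.
def next_backoff(uptime_s: float, current_backoff_s: float,
                 fast_fail_window_s: float = 5.0, cap_s: float = 300.0) -> float:
    """Return the delay before the next restart attempt."""
    if uptime_s < fast_fail_window_s:              # died right after init: crash loop
        return min(current_backoff_s * 2, cap_s)   # back off, up to a cap
    return 1.0                                     # ran long enough: reset the delay

# Three fast failures in a row stretch the delay: 1s -> 2s -> 4s -> 8s
delay = 1.0
for _ in range(3):
    delay = next_backoff(uptime_s=2.0, current_backoff_s=delay)
print(delay)  # 8.0
```

This is roughly what systemd's `StartLimitBurst` or Kubernetes' `CrashLoopBackOff` already do, which is rather the commenter's point.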
More importantly: Why is there not a little thing somewhere saying "Hey, this service that never normally restarts is suddenly doing it every 5-10 seconds"?
Simple statistics to detect frequency changes and unusual behaviour like that.
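The statistics really are simple: count restarts in a sliding window and compare against the service's normal baseline. A sketch, with a hypothetical alarm class and made-up thresholds:

```python
# Hedged sketch: flag a service that "never normally restarts" suddenly
# restarting every few seconds. Window and threshold are illustrative.
from collections import deque

class RestartRateAlarm:
    def __init__(self, window_s: float = 600.0, threshold: int = 3):
        self.window_s = window_s      # look at the last 10 minutes
        self.threshold = threshold    # alert on more restarts than this
        self.restarts = deque()

    def record(self, now: float) -> bool:
        """Record a restart at time `now`; return True if the rate is anomalous."""
        self.restarts.append(now)
        while self.restarts and now - self.restarts[0] > self.window_s:
            self.restarts.popleft()   # drop restarts outside the window
        return len(self.restarts) > self.threshold

alarm = RestartRateAlarm()
# a restart every 7 seconds trips the alarm well before the window closes
for t in range(0, 70, 7):
    fired = alarm.record(float(t))
print(fired)  # True
```

A restart every few months never crosses the threshold; a restart every 5-10 seconds crosses it within the first minute.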
Or whoever is in charge of this global, business-critical, spans-all-customers service should at least have some kind of alert, and then go "WOAH! Everyone stop and tell me what's changed in the last 10 minutes!".