Been there, paid to do that
I did microprocessor validation at AMD & IBM for a decade, about two decades ago.
I'm too busy to dig into these papers, but allow me to lay out what this sounds like. AMD never had problems of this sort while I was there (the validation team Dave Bass built was that good--by necessity.)
Many large customers of both companies found it worthwhile to have their own validation teams. Apple in particular had a validation team that was frankly capable of finding more bugs than the IBM team did in the 750 era. (AMD's customers in the 486 & K5 era would tell us about the bugs that they found in Intel's parts & demand that we match them.)
Hard bugs are the ones that don't always happen--you can execute the same stream of instructions multiple times & get different results. This is almost certainly not the case for the "ransomware" bug. This rules out a lot of potential issues, including "cosmic rays" and "the Earth's magnetic field". (No BOFHs.)
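To make the repeatable-vs-intermittent distinction concrete, here is a minimal sketch in Python. Everything in it is hypothetical and for illustration only: `workload` is a made-up integer-only instruction stream, and `classify` simply reruns it and compares against a result from a known-good part. Real validation suites work at a much lower level than this.

```python
import hashlib

MASK64 = 0xFFFFFFFFFFFFFFFF

def workload(seed, rounds=1000):
    """Hypothetical deterministic workload: integer math only, no clocks,
    no randomness, so a healthy core returns the same digest every time."""
    state = seed & MASK64
    digest = hashlib.sha256()
    for _ in range(rounds):
        # 64-bit linear congruential step, just to keep the ALU busy
        state = (state * 6364136223846793005 + 1442695040888963407) & MASK64
        digest.update(state.to_bytes(8, "little"))
    return digest.hexdigest()

def classify(run, expected, repeats=5):
    """Rerun the same stream and compare with a known-good result."""
    results = {run() for _ in range(repeats)}
    if results == {expected}:
        return "pass"
    if len(results) == 1:
        return "repeatable failure"    # same wrong answer every time
    return "intermittent failure"      # answers differ between runs

# 'expected' would come from a known-good part; here it is the same machine.
expected = workload(42)
print(classify(lambda: workload(42), expected))
```

A part that lands in the "repeatable failure" bucket is the easier case: the same stream gives the same wrong answer, so it can be reduced to a small test.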
The next big question is whether these parts have behaved like this from the time that they were manufactured, or whether the behavior is the result of damage that accumulates during the lifetime of any microprocessor. Variations in the manufacturing process can create either of these. We run tests before the dies are cut to catch the first case. For the latter, we do burn-in tests.
My first project at IBM was to devise a manufacturing test to catch a bug reported by Nintendo in about 3/1000 parts (AIR) during the 750 era. They wanted to find 75% of the bad parts. I took a bit longer than they wanted to isolate the bug, but my test came out 100% effective.
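For a sense of what those numbers mean, here is a back-of-the-envelope sketch in Python. It assumes the test never rejects a good part, and the function name is made up for illustration; the 3/1000 defect rate and the 75% target are the only figures taken from above.

```python
def escapes_per_million(defect_rate, coverage):
    """Bad parts per million shipped, given the fraction of parts that are
    bad and the fraction of bad parts the test catches.
    Assumes no good part is ever rejected (hypothetical simplification)."""
    shipped_bad = defect_rate * (1 - coverage)   # bad parts that slip through
    shipped_total = 1 - defect_rate * coverage   # everything not rejected
    return 1e6 * shipped_bad / shipped_total

for coverage in (0.0, 0.75, 1.0):
    ppm = escapes_per_million(3 / 1000, coverage)
    print(f"coverage {coverage:.0%}: about {ppm:.0f} bad parts per million shipped")
```

At 75% coverage roughly a quarter of the bad parts still ship; only a test that catches all of them takes the number to zero.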
My point is that this has always been an issue. Manufacturers exist to make money. Burn-in tests are expensive to create--and even more expensive to run. You can work with your manufacturer about these issues or you can embarrass them. Sounds like F & G are going for the latter.
Oh, and I'm available. ;)