Re: Interesting effect, wrong explanation
Bringing up transistor variability as a factor in making results unpredictable is really just there to remove the obvious concern of "well, if the target machine is using an i7-7700K, then I can learn the possible values from its RNG seeder just by buying an i7-7700K myself!" I call it the "same CPU" loophole, since it intuitively seems like having the same CPU *should* result in collecting the same timing values (all else being equal: same OS and the rest of the platform stack).
But that's not so. In the cited Lawrence Livermore National Laboratory paper, they profiled thousands of identical servers, and no two of them had matching timing characteristics under similar load.
As for running the same task (again, after making sure the compiler doesn't optimize it away, since the whole point is to "make the CPU work, get the running time, rinse & repeat"), there are lots of factors at play besides transistor variability: data locality, cache behavior, temperature, voltage, task scheduling and background tasks, thread migration, dynamic voltage and frequency scaling... there's a lot going on, and right now it's extremely hard to account for all of it. We just know it works, because we've tested on a wide variety of platforms: an Arduino Uno microcontroller, Raspberry Pis, small-core AMD/Intel, big-core AMD/Intel, etc.
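For concreteness, here's a minimal sketch of the kind of measurement loop I mean (my own illustration, not the exact loop from the paper): time a fixed busy-work function over and over and keep the low-order bits of each measurement, which is where the run-to-run jitter shows up. The workload and sample count here are arbitrary choices for the sketch.

```python
import time

def busy_work(n=1000):
    # Fixed workload whose result we keep, so the interpreter
    # can't skip the loop entirely.
    acc = 0
    for i in range(n):
        acc = (acc * 31 + i) & 0xFFFFFFFF
    return acc

def sample_jitter(samples=256):
    # Time the identical workload repeatedly and keep only the
    # least-significant bit of each duration -- that's where
    # run-to-run jitter lives.
    bits = []
    for _ in range(samples):
        start = time.perf_counter_ns()
        busy_work()
        elapsed = time.perf_counter_ns() - start
        bits.append(elapsed & 1)
    return bits

raw = sample_jitter()
```

To be clear: raw LSBs like these are biased and correlated, so any real seeder would condition them (e.g. run them through a hash) before use; the sketch only shows where the raw material comes from.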
The best we can do to minimize OS noise is to run each test at the absolute highest priority (nice -20). We also make sure each machine has minimal services running, and for machines that are physically accessible we turn off the network adapters as well.
The Arduino Uno is probably the best case: it does literally nothing but run the uploaded sketch, which is just that loop over and over, sending the timings to a laptop that collects them. It still works.
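Once the timings are collected (whether streamed back from the Arduino or measured locally), a quick sanity check is to estimate the entropy of the LSB stream. This is my own toy illustration: a naive Shannon estimate, which is nowhere near a substitute for the proper NIST SP 800-90B min-entropy evaluation, but it's enough to spot a stream that's obviously stuck or heavily biased.

```python
from collections import Counter
import math

def shannon_entropy_per_bit(bits):
    # Naive Shannon entropy of a 0/1 stream, in bits per sample.
    # A rough sanity check only; a real evaluation would use the
    # NIST SP 800-90B min-entropy tests.
    counts = Counter(bits)
    total = len(bits)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Toy input; real input would be the LSBs of the timing samples.
print(shannon_entropy_per_bit([0, 1, 1, 0, 1, 0, 0, 1]))  # balanced stream -> 1.0
```

A stuck stream (all zeros or all ones) scores 0.0, which is exactly the kind of failure you want to catch before the bits ever reach a seeder.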
Now, I have no doubt more work needs to be done. If 10 years from now we want that generation of IT people to think "Huh? Why did you old-timers ever have a problem with seeding??? LOL, every CPU is an entropy machine, why did you ever need those HWRNGs?", and make OS RNG seeding a problem of the past and an active non-concern, then we should be working on simplifying the work loop (it has to be auditable, extremely so, to deter backdoors and other sorts of "oopsies"), testing on all platforms, and standardizing on the simplest, most auditable, and still effective technique across the board (all devices and platforms).
That's where I hope this research is headed. I want the simplest way of gathering entropy, so simple that it's auditable in one scan of an eyeball, even in live environments. And I want this simplest way to apply, mostly identically, across all devices: embedded, IoT, phones, laptops, large servers, everything. That's the blue-sky scenario. When our tech gets to the point that seeding the OS RNG requires nothing but the CPU itself, running one standard algorithm across the board, then we've hit nirvana. Anyone who audits the seeding will expect the same thing everywhere, so it's harder to get confused, and therefore harder for anyone to get away with a backdoor in the seeding process. And if we rely on just the CPU (which, by definition, is what makes a device a device), then we know all our devices will get access to stronger cryptography. If we demand that manufacturers add diodes, HWRNGs, or their own "safe" implementation of a noise source, you know they either just won't do it (cost factor) or they'll screw it up (we can't have people roll their own crypto; that's why we need standards and research).