random...
They could seed the RNG with MS' next OS release date
The pseudo-random number generator used by Microsoft in Windows is flawed, according to security researchers. A team of cryptographers led by Dr. Benny Pinkas from the Department of Computer Science at the University of Haifa, Israel were able to unravel how the CryptGenRandom function Windows 2000 worked, without assistance …
There was a Dilbert cartoon where he was touring accounts. Sitting at a desk was an accounting troll(Who according to Dilbert was made of 90% snot), repeatedlly saying 9,9,9. When Dilbert asked the Head of accounting troll what he was doing, he was told,"He's the random number generator, you never really know do you? "Heheh
An age old problem with random generators! Now, I have seen some companies giving the "trivial" task to code the random number generator to, let's say a little inexperienced, developers and never testing it. But today when random nuber generators are so important ?? It is tedious and takes time to test any new or even old generator on new system / platform but the methods are widely available.
From the Pre-Windows days. "Why bother redoing an algorithm for such a trivial thing as an RNG just because we're gluing a new front-end to DOS?"
And then as the OS is further upgraded and rebuilt from Windows 1.0 to the start of the DOS-less lineage, the thought would have been "Well, that code works. Why waste the time?"
And then the DOS-less branch starts up. "Why should we write our own RNG code? Lets just gank the one out of the DOS version."
And then as that grows into 2k, XP, Vista, etc. you get "That code works. Don't waste the time."
The idea of testing a random number stream by compressing it is both true and useless at the same time. A perfect random stream is indeed incompressible. But any pseudo-random stream is always compressible. The problem is in two parts.
To any computer user "compress" means using a standard compression utility. Which won't work. These are all crafted to take advantage of the low entropy in the target source stream. So one can compress text files due to the redundancy in the language and can use even simple coding (such as Huffman.) Similarly, loss-less music compression and loss-less image compression are designed to look for common patterns that arise in the nature of the source.
One could not take a random number stream and try to compress it with any of these. Even if created with a badly broken generator these compressors would not find any traction.
The key concept is algorithmic information content. Working out the minimum sized device that is capable of reproducing the input stream. The Kolmogorov complexity. The issue with any software RNG is that this complexity is low. It is after all nothing but the RNG function itself, plus the seed. This is irrespective of the implementation or algorithm.
What is broken in the MS RNG appears to not be the algorithm itself, but rather the surrounding implementation. This suffers from two flaws. Emitting a very large stream of deterministic data before getting a new seed, and allowing the seed to be seen externally. It matters nought what the algorithm itself is, with these two flaws in implementation it is fatally damaged. The Kolmogorov complexity requires only knowledge of the algorithm (trivially reverse engineered from the machine code) and the current value of the seed, which we hear is subject to attack and discovery. Since the algorithm itself is fixed, we have a situation where 128k numbers are encoded by a single seed value. This shows the relationship between compression and random numbers. We can create a compression of the stream that is 128 thousand to one. But again, this is true of any pseudo-random generator. This is why seed renewal is so important (and keeping the seed secure.)
Creating a good quality seed is also hard. The entropy present in many seed generators is not high. Grabbing "random" values out of various system registers and the like is very unlikely to get you anything approaching a full word width of entropy. Even physical processes are subject to contamination from nearby periodic noise sources, reducing their entropy.
In all it is a hard problem. However the two blunders in the MS implementation are unforgivable. They suggest that the code was written by someone who had little or no understanding of the issues.
If you add up that there is no LOCK mechanism for memory segments to avoid them going to swap, that compromises even more any crypto process in Windoze.
Now that I think it, you don't need to monkey around with RNG's. Just wait for IE to go over to swap space, and wham, you got your decrypted SSL keys right there in the swapping partition. Oops!
Are we sure Vista and XP still do it this way? Security was a real focus for Vista so maybe they changed it.
Also do we know if the SSL and crypto libs use this function or did they cook their own? Or did they do more to randomize the number?
Have there been any successful attacks using this method?
This is a genuine extract from a document for the Finnish Bankers' Association. Note the second line of the first paragraph particularly, regarding key-generation:
------
Appendix 3: Key management
A key common to all banks is used in the calculation of the authentication identifier. The key is generated in the Finnish Bankers’ Association by tossing a coin 64 times and entering the result so that heads is 0 and tails is 1. The 8-bit bytes of the 64-bit key are given an odd parity, the bits are converted into a hexadecimal format and the result is the key common to all banks.
The key is transferred to reliable people within the banks and they enter the keys into the same or equally protected system as the system where the PATU dongles are stored. The technical solution is bank-specific but the security level must be the same.
------
Predict that!
With no real, effective peer review of the algorithm and implementation, are we surprised?
I have reason to believe, based on knowing several of them, that the technical folks working at Microsoft are generally technically competent. Unfortunately, the Microsoft corporate culture doesn't actually allow any of that talent to create quality products, as the competent technical people are, as usual, simply peons to the mostly technically incompetent managers. Managers whose focus is strictly on the next quarter, not on long term quality.
Ah well. Eventually such a system can hopefully only crumble as people realize that they're being ripped off. Until then, it's a good lesson in how fools and their money are soon parted.
Following the recent article on the pet 2001. We noticed this problem with the rnd generator in basic. we developed a work round; seed the random generator with -ti. The negated time value was a relatively random seed. so running the same code sequentially would get a different value from the rnd function each time.
Nearly 30 feckin' years and Microsoft STILL haven't fixed it.
I loved pointing out to my fellow developers how easy it was to work out a 'random' key based on time/ticks/any system count.
Your clock resolution is in the milliseconds. you used the key at 14/11/2007 at 10:45:30 AM, how many seeds to I have to try before i hit on your seed, and this your key......
If this is such an issue for generating SSL keys then why don't those programs use their own random number generators instead of using Windows' one?
Personally I've found it a godsend that the random number generator routine has stayed the same all through every 32 bit Windows. That way you can use it in games to create random maps for example that will be the same on any version of Windows if you use the same seed number.
Don't blame MS if you're using their simple random number routine when you need something a bit more sophisticated. Get off your arse and write your own.
Quote "It's absolutely impossible for a computer to generate truly random numbers. It can generate truly huge numbers but any number generation based on a formula or algorithm cannot be truly random"
Yes, and that is why Linux "collects entropy" from physical events like network packet arrival times, keypresses and some other places to make the random numbers far less predictable. Some VIA CPU:s even contain a hardware random number generator that can be used. The article implies Windows also collects entropy, but not enough to avoid this attack.
So what's new? When I started in computers nearly 30 years ago, I was told the limitations of the so called random number that were available. Even before this I remember the great fanfare that greeted the latest Ernie - the computer that picks the winners of the premium bonds in the UK, but time has shown that it's no truly or pseudo random...
The big problem is that it's been chosen as the feed for SSL encryption, despite well known limitations. Perhaps they thought it was good enough... Perhaps they didn't think.
There's a good wikipedia article on random number generation, suggest people read it, in it there's this quote:
'John von Neumann famously said "Anyone who uses arithmetic methods to produce random numbers is in a state of sin."'
Why not hook up a webcam infront of a lava-lamp, the computer takes the image as it's source data and generates a number from that, using a formula if you like - providing the attacker never sees the lava lamp from the same angle and at the same distance, it would be impossible to crack....
In 30 years of experience with computers I have discovered only 2 truly random numbers ever to be generated...
Firstly: The length of time it took windows 98 to BSoD
Secondly: IF XP was going to install on a formatted drive.
The former gave a random time of anything up to (but it seems not over) 8 hours whilst the latter was the equivelent of a "coin flip".
Not sure why you guys think compression has anything to do anything random. My random number generator just happened to spit out 1111111111111... that should compress just fine. Well ok it didn't really but a "perfect", "truly" random generator is just as likely to spit that out as anything else so back to the drawing board for you both.
An application can add additional entropy when calling CryptGenRan(). MS have been always clear how the entropy is collected thus it is not surprising you can break it. If you have got a better source of entropy you can use it. The generator behind that call is good enough, the real problem is collecting the entropy.
Random numbers are impossible to generate on a computer. The only random things in nature are quantum mechincal based, like the momentum of an electron emitted from a radioactive decay.
Probably the best random number generator on the market at the moment is TRandom3 with a large period of 2**19937-1, found at:
http://root.cern.ch/root/html/TRandom3.html
However, as with all the best tools, this has been GPL'ed. If you need a RNG, try TRandom3.
. . . read "Not Easy". Thank's Francis - an excellent analysis of the situation.
I might disagree that "a perfect random stream is indeed incompressible". A short, random stream may, entirely by chance, contain a sequence of bits that is susceptible to some form of compression, allowing the stream to be compressed somewhat. However, the larger the stream, the less chance of this being possible. And, as you point out, common compression routines rely on the expected nature of the input stream - compressing "random" data has little effect, pseudo- or otherwise.
As far as using the time as a random seed (@wobbly1 and @Matware), using -ti on the Pet WAS good practice - because (AFAIR) the PET, like most early computers, didn't have a real-time clock. The time counter on old machines generally counted ticks since system power-up. If this counted in milliseconds then, after just over a minute, the least-significant 16 bits of this number would provide a reasonably good random seed.
For "Modern" PCs, you could subtract the boot time from the real time and again take the least-significant portion. Knowing what time the generator had been seeded would then be useless without knowing when the system was last booted.
No deterministic machine can produce a truly random number. All computers user pseudo-random number algorithms. Indeed, this is often relied on: seeding the RNG can be used as a development tool (reproduce a "random" input stream that causes progem problems) or as in the example of Sean O'Connor, to consistently re-produce "random" maps.
Pseudo-RNGs are quite acceptable - as long as you know their limitations and the seed is well-chosen and protected. That's why being a good programmer is not just about coding. You need a good grounding in maths (particularly statistics & probability) so you at least know where the traps lie!
@AC:
That's insecure. Probably more insecure than Windows SSL certificates. Here's how you can tell:
"The key is transferred to reliable people within the banks and they enter the keys into the same or equally protected system as the system where the PATU dongles are stored"
They might as well claim that the key is transferred to the Kingdon of the Elves and that a big giant will club anyone who tries to take it away.
Rabbi, actually the longer you go, the *more* likely you are to find a compressible sequence. Suppose your compression is nothing more than run-length encoding, so every time you receive the same character more than once, you can compress it to say "2 As". The longer your random sequence, the greater the chances of finding two or more of the same character next to each other, just as you're more likely to get five consecutive coin-flips landing on heads in a series of a thousand flips than a series of six.
This reverses the claims made by other people above. If you *don't* get some compressible chunks in your sequence, that's a sure-fire indication that the sequence is *not* random, because any random sequence *would* have a few compressible chunks! If there are no compressible chunks, that indicates there's an algorithm behind it which is making sure the same output value isn't repeated until all other possible values have been used. That's a serious weakness in the algorithm, because if you can take a guess at the starting point and the range of values possible, you can predict future values by knowing which previous values can't come up this round - the digital equivalent of card-counting, essentially.
The problem is that humans are so deeply wired for pattern-recognition that we aren't very good at handling non-patterns - in particular, we intuitively see an apparently predictable sequence as a fault in a random system. So a truly random system that throws up "ABCD" or "AAAA" looks to us to be faulty, whereas a non-random system that throws up "ZQTW" looks like it's working.
Deterministic state machines found to behave deterministically. Who would have thought that?
If you want to get random numbers, you need some non-deterministic process. Radioactive decay is commonly used, but nobody is going to want a computer with a radioactive device in it (although I bet they've all got smoke detectors). Feeding amplified static into an A-to-D converter might also work.
there are a number of well-respected test suites for randomness available, here, here and here:
(NIST) http://csrc.nist.gov/groups/ST/toolkit/rng/index.html
(Diehard) http://stat.fsu.edu/pub/diehard/
(l'Ecuyer) http://www.iro.umontreal.ca/~simardr/indexe.html (in English)
(en francais) http://www.iro.umontreal.ca/~simardr/
there is (was?) a French development in the area of collecting entropy from modern CPU internal state, here:
http://www.irisa.fr/caps/projects/hipsor/old/HAVEGE.html (in English)
There are a number of hardware entropy generators, the performance of these varies, see here:
http://www.robertnz.net/true_rng.html
Also, go to Robert's home page for more stuff
> A key common to all banks is used in the calculation of the authentication identifier. The key is generated in the Finnish Bankers’ Association by tossing a coin 64 times
So are we saying that people working in banks are tossers as well as bankers?... (anonymous cos I work for a bank)
Just for the lot here that misses the point. This article contains 3 very important points:
- researchers reconstituted the state of the generator *without* knowing how it worked. The generator must really be a load of crap, then.
- each state is used to generate 128 KB data. That's f***ing *huge*. It means the attacker, after he knows it, will get access to the keys for a lot of the future new sessions (SSH, SSL, whatever else), before they're started !
Linux random generator, if I recall, spits out only 512 bytes (from /dev/random) from a known state, then it needs more entropy.
- also, being able to access to the generator from user privileges opens a lot to attackers.
As usual, very good work from MiKrosoft on the security aspect. Could El Reg send them a copy of Knuth books as Christmas gift ?
PS: for people asking themselves why SSL doesn't implement its own gen, that's because a random gen needs entropy, and the only way to collect that on a computer is to access low level timers/events etc ... which is only accessible from kernel, not userland where SSL runs.
The random number generator in Excel is indeed predictable, and was responsible for a problem in that Noel Edmond atrocity, Deal or No Deal, where the adjudicators used Excel's pseudorandom generator to determine what values went in which boxes. This led to a predictable pattern in the game, and several shows where all the boxes had the same values as a previous show (in some cases mere days before). This went unnoticed for almost three months before they changed the selection method.
" The only random things in nature are quantum mechanical based, like the momentum of an electron emitted from a radioactive decay."
Just because you have not detected any pattern to date using the measuring apparatus you have employed to observe quantum mechanical events, it doesn't mean that there isn't any pattern. Is this idea a statement of physics or faith ? If you want to know where one ends and the other begins it helps to check your unstated but working assumptions. Interestingly the New Scientist not long ago described the idea that genuine randomness exists as a matter of faith, like the idea of the existence or non-existence of God.
Building upon your brilliant logic, we shouldn't use the provided printf() either in case it's broken! Let's have all our programs do everything themselves rather than using common system libraries. The logical conclusion is that we'd end up with every program implementing its own incompatible drivers for any piece of hardware it ever needed to talk to.
Actually, that's how it used to be in the DOS days.
@Sean: You're mixing up two very different windows PRNGs - the simplistic rand() function from the MSVCRT is the one you're thinking of, whereas this article is about the RNG in the windows CryptoAPI subsystem.
@David: That's not your idea at all. Credit where it's due: http://www.lavarnd.org/
@wobbly1
"We noticed this problem with the rnd generator in basic."
i.e. We didn't read the manual and guessed, incorrectly, at how the function was supposed to work. When we finally worked out what it was doing, exactly as designed, we blamed it for not being how we expected instead of blaming ourselves for guessing wrong.
" we developed a work round; seed the random generator with -ti."
i.e. we finally read the manual and learned how to use the function correctly.
"Nearly 30 feckin' years and Microsoft STILL haven't fixed it."
i.e nearly 30 years and you still haven't understood what you were doing wrong.
The PET's rand function was carefully designed. Use a negative argument to set the seed. Use a zero argument to get the next value in the series from the current seed. Use a positive argument to get the exact same constant single answer (for any given positive input value) every time. This gives you the ability to easily have a constant result or a constant series of results during development, and a properly pseudo-random series in release versions of the software, simply by changing the value you pass to rand(). It's your fault you passed the wrong argument and passing the right one isn't "developing a workround" for any problem in the rand() implementation, it's fixing the bug in *your* code.
The article was well written and to the point.
This flaw isn't new, random number generation that didn't have some form of exterior input isn't truly random. (exterior input like radio noise from space, isotope decay, or thermo resistors, etc ...)
I was surprised at the simplicity of the attacks, and the lack of sophistication on the part of Microsoft.
As the article points out, the random number generator is central to the SSL mechanism. So how can a company claim to place security high on its list fail so badly?
Seriously? Why would anyone bother to test something like this in Windows 2K, and not test it in XP? And if the attack is as trivial as they suggest, why would they not verify whether the vulnerability still exists in XP or Vista?
It strikes me that there's something fishy about this report.
No one test for randomness is "best" under all circumstances. Moreoever, the tests most useful in one context may not be useful in another context.
Frequency distributions are helpful, but by no means a panacea. Moreover for modest strings of "random" digits, there is a predictable variation in the number of occurences of each digit, and too flat a distribution is suspicious.
The thing to keep in mind that the randomness boffins long ago investigated these issues in minute detail. Given the resultant huge literature on the generation of quasi-random numbers, how to do it, how not to do it, how to test it, Microsoft's failure to do better than they did is simply disgusting.
One wonders how many latent bugs there are in Vista that would have been stepped on if Vista weren't in bed giving blowjobs to the movie/music cartel.
Gosh to have had the benefits of your perspicacity then!
We of course foolishly concluded that a function with a numeric output described as "random" would produce a "random" number. During development we discovered our numbskull mistake.
Manuals... Luxury! This was during the first months of the PET being in the country before the sales push , i worked for one of the first dealers , we had two machines , we didn't have the manuals, we had a folder of photocopy sheets, some missing some illegible.
Nigel, (For it was he), came in one morning with the -ti workround. whether he had spent half a weeks wages telephoning the States (where the manuals where available ) or had divine inspiration is dimmed with mists of time...
A few points:
- Yes, computers might be deterministic, finite state machines. But that doesn't mean cryptographically-secure sequences can't be generated algorithmically. In any case, the problem here is the sourcing and refreshing of the entropy, not the algorithm (although implementation in user-mode was perhaps a poor choice).
- Banks are generally very good at key management. Not all keys reside in a single place--for some applications, they need to be shared between systems, or backed up. Using XOR key shares with independent key custodians is best practice, and secures the entire credit card industry (including your PIN) for a start. Remember--it's manual procedures that establish trust and security, all this technical stuff just helps to preserve it afterwards.
- However, any bank resorting to coin tossing should be shot. Banks (and other serious users) use Hardware Security Modules, from vendors such as nCipher, Thales, Atalla (HP), Safenet and IBM. These provide hardware entropy sources, and also secure keys safely away from server memory.
- Intel chipsets have indeed included thermal-noise RNGs for years. Poor that Windows chooses not to take advantage.
That's a problem with documentation. Anyone who's done any programming at all, will be quite clear that API functions usually do something different to what you expect it to do. Without reading documentation on it (even if the docs weren't available to you), you can't complain that it's wrong, and you're still referring to it as a workaround, when it is actually just the correct way of generating random numbers with the function. The real problem is the lack of documentation, nothing whatsoever to do with the function or what it does.
There are too many points I could make here, but the most pressing one is who is trusting the RNG in the first place? Perhaps the subject is not too well understood - I note I only covered it at Uni as subject matter for a seminar, and even then the subject was of my own choosing. But the first thing you learn when you study the things is not to trust them. An RNG needs testing to ensure its suitability for the problem being tackled, and the amount of testing that needs doing depends on the problem.
If you take for instance, a poker game (for fun, not betting), it probably doesn't matter too much if the random number generation isn't perfect, but anything using randomness in a statistical or security scenario needs proper testing for the specific purpose for which it is proposed. For example, one of my projects needs an RNG with an extremely long period and no bias towards any particular range. However, it would usually be happy with a simple sequence 1,2,3,4,5... It is only 'unusual' usage patterns that require randomness at all. This particular set of requirements isn't covered by simply saying "this is a good/bad RNG".
The point is that anyone using an OS function for anything more important than the poker example I gave above is the one at fault for failing to ensure that the generator is suitable for their own particular requirements. This is a case of buyer beware.
Having said this, like others I'd like to see (tested, sound) hardware random number generation on board somewhere as standard equipment. I think it is too much to hope that people will consider this too much when writing software, and so easy access to decent source of randomness must be a good thing.
Here is one of the Lava lamp projects:
http://www.lavarnd.org/news/lavalite-care.html
But my favourite has always been Hotbits, the radioactive decay random number generator. They buried the radioactive source and detector in an old water storage tank underground to make sure it didn't suffer any bias. But it takes the count of a Geiger counter and then sees if the number is odd or even to generate a 1 or 0. Random enough for me:
http://www.fourmilab.ch/hotbits/