Previous Topic

Next Topic

Book Contents

Book Index

Why You Need Really Random Numbers

Pseudorandom Number Generators: Not Really Random Numbers!

Most random number generators are in fact "pseudo-random" number generators. They generate values according to some internal equations. The nature of the equations tends to be such that the values appear to be random and perhaps even pass many statistical measures of randomness. However, all "pseudo-random" number generators have a "cycle". After a single cycle has passed, some underlying property of the numbers repeats in the same order as it appeared before.

Pseudo-random number generators are deployed in all common programming languages, including Visual Basic, C++, BASIC, Pascal, and many more. The combination of pseudo-random number generators and bad randomness implementations can be remarkably predictable.

Pitfalls of High Entropy Random Number Generators

Please don't be lured into believing that a random number generator that produces relatively high entropy is truly random (here we are using the -p * log p definition of entropy). By definition, the entropy statistic is calculated as a measure of "evenness" or "dispersion" in a sample. For example, a sample of 1 million bits in which exactly 500,000 are 0 and 500,000 are 1 has a bit-calculated entropy of 1.000 bits per bit.

Our experience is that most random number generators, including those of cryptographic quality, produce extremely high entropy results. Comparatively high entropy does not imply randomness. Consider the following:

Suppose we have an unbiased ZERO-ONE random bit generator X. Suppose that 95% of the time 1 million bits of data generated by X produces entropy values >= 0.999999 bits per bit. It's a simple exercise to show that this performance implies that 95% of the time the number of ONE bits out of 1 million bits generated by X is roughly in the range 500,000 +/- 500.

Binary data produced by a process that is truly random, defined for our purposes as.:

will not have entropy values as consistently high as for generator X. In fact, relatively simple calculations show that a truly random process will produce entropy values over 0.999999 bits per bit roughly only about 70% of the time. The fact that generator X produces large sample data closer to the mean implies that the variance of the sum of bits produced by generator X is less than the variance of those produced by a truly random process. Since variance is additive, the only way lower variance is possible is if the draws from generator X have overall slightly negative covariance and therefore the data produced by generator X cannot be random according to the definition above.

When the underlying binary data is not truly random, it is very difficult to produce transformations of the binary data that have desired randomness properties. For example, random permutations created by transformations of data that doesn't meet our criteria for randomness may have a bias towards certain outcomes, a distinctively undesirable property.

Really Random Numbers attempts to start with truly random binary data, so that all subsequent transformations yield consistently random results.

See Also

Introduction

Advantages of Really Random Numbers

Cryptographically Secure Random Numbers

Theoretical Underpinnings

Permutations and Combinations