Random number generation in C++11: how to generate, how does it work?

user72424 picture user72424 · Aug 18, 2011 · Viewed 76.6k times · Source

I recently came across new way to generate random numbers in C++11, but couldn't digest the papers that I read about it (what is that engine, maths term like distribution, "where all integers produced are equally likely").

So can anyone please explain

  • what are they?
  • what does they mean?
  • how to generate?
  • how do they work?
  • etc

You can call it all in one FAQ about random number generation.

Answer

Kerrek SB picture Kerrek SB · Aug 18, 2011

The question is way too broad for a complete answer, but let me cherry-pick a couple of interesting points:

Why "equally likely"

Suppose you have a simple random number generator that generate the numbers 0, 1, ..., 10 each with equal probability (think of this as the classic rand()). Now you want a random number in the range 0, 1, 2, each with equal probability. Your knee-jerk reaction would be to take rand() % 3. But wait, the remainders 0 and 1 occur more often than the remainder 2, so this isn't correct!

This is why we need proper distributions, which take a source of uniform random integers and turn them into our desired distribution, like Uniform[0,2] in the example. Best to leave this to a good library!

Engines

Thus at the heart of all randomness is a good pseudo-random number generator that generates a sequence of numbers that uniformly distributed over a certain interval, and which ideally have a very long period. The standard implementation of rand() isn't often the best, and thus it's good to have a choice. Linear-congruential and the Mersenne twister are two good choices (LG is actually often used by rand(), too); again, it's good to let the library handle that.

How it works

Easy: first, set up an engine and seed it. The seed fully determines the entire sequence of "random" numbers, so a) use a different one (e.g. taken from /dev/urandom) each time, and b) store the seed if you wish to recreate a sequence of random choices.

#include <random>

typedef std::mt19937 MyRNG;  // the Mersenne Twister with a popular choice of parameters
uint32_t seed_val;           // populate somehow

MyRNG rng;                   // e.g. keep one global instance (per thread)

void initialize()
{
  rng.seed(seed_val);
}

Now we can create distributions:

std::uniform_int_distribution<uint32_t> uint_dist;         // by default range [0, MAX]
std::uniform_int_distribution<uint32_t> uint_dist10(0,10); // range [0,10]
std::normal_distribution<double> normal_dist(mean, stddeviation);  // N(mean, stddeviation)

...And use the engine to create random numbers!

while (true)
{
  std::cout << uint_dist(rng) << " "
            << uint_dist10(rng) << " "
            << normal_dist(rng) << std::endl;

}

Concurrency

One more important reason to prefer <random> over the traditional rand() is that it is now very clear and obvious how to make random number generation threadsafe: Either provide each thread with its own, thread-local engine, seeded on a thread-local seed, or synchronize access to the engine object.

Misc

  • An interesting article on TR1 random on codeguru.
  • Wikipedia has a good summary (thanks, @Justin).
  • In principle, each engine should typedef a result_type, which is the correct integral type to use for the seed. I think I had a buggy implementation once which forced me to force the seed for std::mt19937 to uint32_t on x64, eventually this should be fixed and you can say MyRNG::result_type seed_val and thus make the engine very easily replaceable.