Algorithm to generate Poisson and binomial random numbers?

snap · Aug 6, 2009

I've been looking around, but I'm not sure how to do it.

I've found this page which, in the last paragraph, says:

A simple generator for random numbers taken from a Poisson distribution is obtained using this simple recipe: if x1, x2, ... is a sequence of random numbers with uniform distribution between zero and one, k is the first integer for which the product x1 · x2 · ... · x(k+1) < e^(−λ)

I've found another page describing how to generate binomial numbers, but I think it is using an approximation of Poisson generation, which doesn't help me.

For example, consider binomial random numbers. A binomial random number is the number of heads in N tosses of a coin with probability p of a heads on any single toss. If you generate N uniform random numbers on the interval (0,1) and count the number less than p, then the count is a binomial random number with parameters N and p.

I know there are libraries to do it, but I can't use them, only the standard uniform generators provided by the language (Java, in this case).

Answer

Kip · Aug 6, 2009

Poisson distribution

Here's how Wikipedia says Knuth says to do it:

init:
     Let L ← e^(−λ), k ← 0 and p ← 1.
do:
     k ← k + 1.
     Generate uniform random number u in [0,1] and let p ← p × u.
while p > L.
return k − 1.

In Java, that would be:

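public static int getPoisson(double lambda) {
  // Threshold e^(-lambda); the loop stops once the running product drops to it.
  double L = Math.exp(-lambda);
  double p = 1.0;  // running product of uniform draws
  int k = 0;       // number of uniforms drawn so far

  do {
    k++;
    p *= Math.random();
  } while (p > L);

  // The last draw pushed the product to or below L, so it is not counted.
  return k - 1;
}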
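One caveat with the version above: Math.exp(-lambda) underflows to 0.0 once lambda gets larger than roughly 745, so the threshold L is no longer meaningful and the results are wrong for large lambda. A possible workaround is to add up -log(u) and compare against lambda directly, which is the same test in log space. This is only a sketch; the class name PoissonLogSpace and the extra java.util.Random parameter (so a seed can be fixed) are my own additions, not part of the Wikipedia/Knuth description.

```java
import java.util.Random;

final class PoissonLogSpace {
  // Same stopping rule as the product form, rewritten with logarithms:
  // u1 * u2 * ... * uk <= e^(-lambda)  <=>  -log(u1) - ... - log(uk) >= lambda.
  // The rng parameter is an assumption added here for reproducibility.
  static int getPoisson(double lambda, Random rng) {
    double sum = 0.0;  // running sum of -log(u)
    int k = -1;        // becomes 0 after the first draw
    do {
      k++;
      // nextDouble() is in [0, 1), so 1 - nextDouble() is in (0, 1]
      // and the logarithm is always finite.
      sum -= Math.log(1.0 - rng.nextDouble());
    } while (sum < lambda);
    return k;
  }
}
```

It is still O(λ) per sample, and the calls to Math.log make each step slower than a multiplication, so this only pays off when lambda can be large.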

Binomial distribution

Chapter 10 of Non-Uniform Random Variate Generation (PDF) by Luc Devroye (which I found linked from the Wikipedia article) gives this:

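public static int getBinomial(int n, double p) {
  int x = 0;  // number of successes so far
  // Simulate n independent trials, each succeeding with probability p.
  for(int i = 0; i < n; i++) {
    if(Math.random() < p)
      x++;
  }
  return x;
}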
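If you want to convince yourself that both generators behave, a quick check is to compare the sample mean and variance against the theoretical values: λ and λ for Poisson, n·p and n·p·(1 − p) for binomial. The sketch below copies the two methods but passes in a java.util.Random so the run is repeatable; the class name GeneratorCheck, the helper meanAndVariance, the seed and the parameter values are all just illustrative choices of mine.

```java
import java.util.Random;

final class GeneratorCheck {
  // Same algorithms as above, but drawing from a seeded Random (an assumption
  // made here for repeatability) instead of Math.random().
  static int getPoisson(double lambda, Random rng) {
    double limit = Math.exp(-lambda);
    double p = 1.0;
    int k = 0;
    do {
      k++;
      p *= rng.nextDouble();
    } while (p > limit);
    return k - 1;
  }

  static int getBinomial(int n, double p, Random rng) {
    int x = 0;
    for (int i = 0; i < n; i++) {
      if (rng.nextDouble() < p) x++;
    }
    return x;
  }

  // Returns {sample mean, sample variance} of the given draws.
  static double[] meanAndVariance(int[] draws) {
    double sum = 0.0, sumSq = 0.0;
    for (int d : draws) {
      sum += d;
      sumSq += (double) d * d;
    }
    double mean = sum / draws.length;
    return new double[] { mean, sumSq / draws.length - mean * mean };
  }

  public static void main(String[] args) {
    Random rng = new Random(12345L);  // arbitrary seed
    int samples = 200_000;
    int[] pois = new int[samples];
    int[] bin = new int[samples];
    for (int i = 0; i < samples; i++) {
      pois[i] = getPoisson(4.0, rng);
      bin[i] = getBinomial(20, 0.3, rng);
    }
    double[] a = meanAndVariance(pois);
    double[] b = meanAndVariance(bin);
    System.out.printf("Poisson(4):       mean=%.3f var=%.3f (theory: 4 and 4)%n", a[0], a[1]);
    System.out.printf("Binomial(20,0.3): mean=%.3f var=%.3f (theory: 6 and 4.2)%n", b[0], b[1]);
  }
}
```

Matching moments does not prove the distribution is right, but it catches the usual off-by-one mistakes (for example returning k instead of k − 1).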

Please note

Neither of these algorithms is optimal. The first takes O(λ) expected time per sample, the second O(n). Depending on how large these values typically are, and how frequently you need to call the generators, you might need a better algorithm. The book linked above has more complicated algorithms that run in constant time, but I'll leave those implementations as an exercise for the reader. :)
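As a middle ground for the binomial case when n is large but p is small, one option is a waiting-time approach: instead of simulating every trial, jump straight from one success to the next by drawing the geometrically distributed gap between successes, and count how many successes fit into n trials. This is not one of the constant-time algorithms; it takes time proportional to about n·p + 1 per sample. Treat it as a sketch: the class name BinomialWaitingTime and the java.util.Random parameter are my own choices for illustration.

```java
import java.util.Random;

final class BinomialWaitingTime {
  // Counts successes in n trials by summing geometric gaps between successes.
  // The rng parameter is an assumption added here so a seed can be fixed.
  static int getBinomial(int n, double p, Random rng) {
    if (p <= 0.0) return 0;  // no trial can succeed
    if (p >= 1.0) return n;  // every trial succeeds
    double logQ = Math.log1p(-p);  // log(1 - p), strictly negative here
    int x = 0;                     // successes found so far
    double trials = 0.0;           // trials used so far (double avoids overflow)
    while (true) {
      double u = 1.0 - rng.nextDouble();  // uniform in (0, 1]
      // Number of trials up to and including the next success:
      // P(gap > g) = (1 - p)^g for g = 0, 1, 2, ...
      double gap = Math.floor(Math.log(u) / logQ) + 1.0;
      trials += gap;
      if (trials > n) {
        return x;  // the next success would land after trial n
      }
      x++;
    }
  }
}
```

For p above 0.5 the loop does more work than the simple O(n) version would need; in that case you can generate with 1 − p instead and return n minus the result.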