Data structure for loaded dice?

algorithm language-agnostic data-structures random probability

templatetypedef · Feb 17, 2011 · Viewed 11.8k times · Source

Suppose that I have an n-sided loaded die where each side k has some probability p_k of coming up when I roll it. I'm curious if there is a good algorithm for storing this information statically (i.e. for a fixed set of probabilities) so that I can efficiently simulate a random roll of the die.

Currently, I have an O(lg n) solution for this problem. The idea is to store a table of the cumulative probability of the first k sides for all k, then to generate a random real number in the range [0, 1) and perform a binary search over the table to get the largest index whose cumulative value is no greater than the chosen value. I rather like this solution, but it seems odd that the runtime doesn't take the probabilities into account. In particular, in the extremal cases of one side always coming up or the values being uniformly distributed, it's possible to generate the result of the roll in O(1) using a naive approach, though my solution will still take logarithmically many steps.
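For concreteness, here is a minimal Python sketch of the cumulative-table approach described above. The function names `build_cumulative` and `roll` are made up for illustration, sides are 0-based, and the table stores inclusive prefix sums (so the search looks for the first entry that exceeds the random value, which is equivalent to the formulation above up to an index shift). The optional `u` parameter exists only so the lookup can be checked deterministically.

```python
import bisect
import itertools
import random


def build_cumulative(probs):
    """Return inclusive prefix sums: cum[k] = p_0 + ... + p_k. O(n) setup."""
    return list(itertools.accumulate(probs))


def roll(cum, u=None):
    """Return a 0-based side index via binary search. O(lg n) per roll."""
    if u is None:
        # Scale by the total so small rounding errors in the sum are harmless.
        u = random.random() * cum[-1]
    # bisect_right returns the first index whose cumulative value exceeds u.
    index = bisect.bisect_right(cum, u)
    # Guard against u landing exactly on (or past) the final entry.
    return min(index, len(cum) - 1)


cum = build_cumulative([0.25, 0.5, 0.25])
print(roll(cum, 0.1), roll(cum, 0.5), roll(cum, 0.9))
```

Note that the lookup cost depends only on n, not on the probabilities themselves, which is exactly the non-adaptive behavior the question is asking about.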

Does anyone have any suggestions for how to solve this problem in a way that is somehow "adaptive" in its runtime?

EDIT: Based on the answers to this question, I have written up an article describing many approaches to this problem, along with their analyses. It looks like Vose's implementation of the alias method gives Θ(n) preprocessing time and O(1) time per die roll, which is truly impressive. Hopefully this is a useful addition to the information contained in the answers!

Answer

You are looking for the alias method, which provides an O(1) method for sampling from a fixed discrete probability distribution (assuming you can access entries in an array of length n in constant time) with a one-time O(n) set-up. You can find it documented in chapter 3 (PDF) of "Non-Uniform Random Variate Generation" by Luc Devroye.

The idea is to take your array of probabilities p_k and produce three new n-element arrays, q_k, a_k, and b_k. Each q_k is a probability between 0 and 1, and each a_k and b_k is an integer between 1 and n.

We generate a random number between 1 and n by first generating two random numbers, r and s, between 0 and 1. Let i = floor(r*n)+1. If q_i < s then return a_i else return b_i. The work in the alias method is in figuring out how to produce q_k, a_k and b_k.
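The answer leaves the table construction open, so here is a hedged Python sketch using Vose's variant of the alias method (the one mentioned in the question's edit), not necessarily the construction given in Devroye's book. It uses two arrays instead of three: `prob[i]` plays the role of q_i, `alias[i]` plays the role of a_i, and b_i is simply i itself. Indices are 0-based, and the names `build_alias` and `roll` are made up for illustration; the optional `r` and `s` parameters exist only to make the lookup checkable.

```python
import random


def build_alias(probs):
    """Build (prob, alias) tables in O(n) using Vose's construction."""
    n = len(probs)
    scaled = [p * n for p in probs]   # average column height becomes 1
    prob = [1.0] * n                  # default: column i always returns i
    alias = list(range(n))
    small = [i for i, h in enumerate(scaled) if h < 1.0]
    large = [i for i, h in enumerate(scaled) if h >= 1.0]
    while small and large:
        lo = small.pop()
        hi = large.pop()
        prob[lo] = scaled[lo]         # column lo keeps its own mass ...
        alias[lo] = hi                # ... and is topped up from hi
        scaled[hi] = (scaled[hi] + scaled[lo]) - 1.0
        (small if scaled[hi] < 1.0 else large).append(hi)
    # Anything left over (rounding error) keeps prob 1.0 and aliases itself.
    return prob, alias


def roll(prob, alias, r=None, s=None):
    """Return a 0-based side index in O(1): pick a column, then flip a coin."""
    r = random.random() if r is None else r
    s = random.random() if s is None else s
    i = int(r * len(prob))
    return i if s < prob[i] else alias[i]


prob, alias = build_alias([0.5, 0.25, 0.25])
print(roll(prob, alias))
```

Each roll costs one array index and one comparison regardless of n, which is where the O(1) bound comes from; all of the work is in the one-time O(n) table construction.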
