Non-uniform random variate generation

Pseudo-random number sampling is the numerical practice of generating pseudo-random numbers that are distributed according to a given probability distribution.

In the following, it is taken for granted that there is a pseudo-random number generator producing numbers X that are uniformly distributed. Different algorithms are then used to transform X into a random variable U(X) that is distributed according to a given distribution f(C).

Historically, basic methods of pseudo-random number sampling were developed for Monte-Carlo simulations in the Manhattan project; they were first published by John von Neumann in the early 1950s.

Finite discrete distributions

For discrete probability distributions with a finite number n of values (at least, a finite number of values having nonzero probability), the basic sampling algorithm is straightforward. The interval [0,1) is divided in n intervals [0,f(1)), [f(1),f(1)+f(2)), ... The width of interval i equals the probability f(i). One draws a uniformly distributed pseudo-random number X, and searches for the index i of the corresponding interval. The so determined i will have the distribution f(i).

Formalizing this idea becomes easier by using the cumulative distribution function

F(i)=\sum _{j=1}^{i}f(j).

It is convenient to set F(0)=0. The n intervals are then simply [F(0),F(1)), [F(1),F(2)), ..., [F(n-1),F(n)). The main computational task is then to determine i for which F(i-1)≤X<F(i).

This can be done by different algorithms:

Linear search, computational time linear in n.
Binary search, computational time goes with log n.
Indexed search^[1], also called the cutpoint method^[2].
Alias method, computational time is constant, using some pre-computed tables.
There are other methods that cost constant time.^[3]

Continuous distributions

Generic methods:

Rejection sampling
Inverse transform sampling
Slice sampling
Ziggurat algorithm, for monotonously decreasing density functions
Convolution random number generator, not a sampling method in itself: it describes the use of arithmetics on top of one ore more existing sampling methods to generate more involved distributions.

For generating a normal distribution:

For generating a Poisson distribution:

See Poisson distribution#Generating Poisson-distributed random variables

Footnotes

^ see Ripley
^ Fishman
^ see Fishman

Literature

Ripley,B D: Stochastic Simulation. Whiley (1987).
Fishman,GS: Monte Carlo. Concepts, Algorithms, and Applications. New York: Springer (1996).
Knuth,DE: The Art of Computer Programming, Vol. 2 Seminumerical Algorithms, Chapter 3.4.1 (3rd edition, 1997).

[1] see Ripley

[2] Fishman

[3] see Fishman

[1]

[2]

[3]