
Discrete probability distribution

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Stpasha (talk | contribs) at 01:31, 12 October 2010 ({{mergeto|Probability distribution}}). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.
The probability mass function of a discrete probability distribution. The probabilities of the singletons {1}, {3}, and {7} are respectively 0.2, 0.5, 0.3. A set not containing any of these points has probability zero.
The cdf of a discrete probability distribution, of a continuous probability distribution, and of a distribution which has both a continuous part and a discrete part.

In probability theory and statistics, a discrete probability distribution is a probability distribution characterized by a probability mass function. Thus, the distribution of a random variable X is discrete, and X is then called a discrete random variable, if

$$\sum_u \Pr(X = u) = 1 \qquad (1)$$

as u runs through the set of all possible values of X. It follows that such a random variable can assume only a finite or countably infinite number of values. That is, the possible values can be listed, although the list might be infinite. For example, count observations such as the numbers of birds in flocks take only natural-number values {0, 1, 2, ...}. By contrast, continuous observations such as the weights of birds take real-number values and would typically be modeled by a continuous probability distribution such as the normal distribution.
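As an illustration of the defining condition, the following minimal sketch uses the probability mass function from the figure caption above (probabilities 0.2, 0.5 and 0.3 at the points 1, 3 and 7). The names `pmf`, `prob` and `total` are chosen here for illustration only, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction

# Illustrative pmf from the figure caption: P(X=1)=0.2, P(X=3)=0.5, P(X=7)=0.3.
pmf = {1: Fraction(1, 5), 3: Fraction(1, 2), 7: Fraction(3, 10)}


def prob(event, pmf):
    """Probability that X lies in `event`, an iterable of candidate values."""
    # Values outside the support contribute probability zero.
    return sum((pmf.get(u, Fraction(0)) for u in set(event)), Fraction(0))


# The point probabilities over all possible values sum to one.
total = sum(pmf.values())
```

Here `prob({1, 3}, pmf)` adds the point masses at 1 and 3, while a set containing none of the points 1, 3, 7 receives probability zero.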

In cases more frequently considered, this set of possible values is a topologically discrete set in the sense that all its points are isolated points. But there are discrete random variables for which this countable set is dense on the real line (for example, a distribution over rational numbers).
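One way such a distribution could be constructed (an illustrative example, not taken from the text above) is to fix an enumeration $q_1, q_2, \dots$ of the rational numbers and assign geometrically decreasing probabilities:

```latex
\Pr(X = q_n) = 2^{-n}, \quad n = 1, 2, \dots,
\qquad\text{so that}\qquad
\sum_{n=1}^{\infty} \Pr(X = q_n) = \sum_{n=1}^{\infty} 2^{-n} = 1 .
```

Every rational number then has positive probability, so the set of possible values is countable and dense in the real line, and none of its points is isolated.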

Among the best-known discrete probability distributions used for statistical modeling are the Poisson distribution, the Bernoulli distribution, the binomial distribution, the geometric distribution, and the negative binomial distribution. In addition, the discrete uniform distribution is commonly used in computer programs that make equal-probability random selections among a number of choices.
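An equal-probability selection of this kind might look as follows in Python, using the standard library's `random.Random.choice`. The list of options, the seed and the sample size are arbitrary assumptions made for this sketch.

```python
import random
from collections import Counter

# Hypothetical options; each should be selected with probability 1/4.
choices = ["red", "green", "blue", "yellow"]

rng = random.Random(0)  # fixed seed so the run is reproducible
draws = [rng.choice(choices) for _ in range(40_000)]

# Empirical relative frequency of each option.
freq = {c: n / len(draws) for c, n in Counter(draws).items()}
```

With this many draws, each relative frequency in `freq` should be close to 0.25, which is the discrete uniform probability for four choices.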

Alternative description

Equivalently, a discrete random variable can be defined as a random variable whose cumulative distribution function (cdf) increases only by jump discontinuities; that is, its cdf increases only where it "jumps" to a higher value, and is constant between those jumps. The points where jumps occur are precisely the values which the random variable may take with positive probability. The number of such jumps may be finite or countably infinite. The set of locations of such jumps need not be topologically discrete; for example, the cdf might jump at each rational number.
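The step-function behaviour can be sketched for the distribution in the figure caption (point masses 0.2, 0.5 and 0.3 at 1, 3 and 7). The helper names `cdf`, `left_limit` and `jump` are assumptions made for this illustration.

```python
from bisect import bisect_left, bisect_right
from fractions import Fraction
from itertools import accumulate

# Illustrative pmf from the figure caption.
pmf = {1: Fraction(1, 5), 3: Fraction(1, 2), 7: Fraction(3, 10)}

points = sorted(pmf)                                    # jump locations
cumulative = list(accumulate(pmf[u] for u in points))   # running totals


def cdf(x):
    """F(x) = P(X <= x): total mass at points less than or equal to x."""
    k = bisect_right(points, x)
    return cumulative[k - 1] if k else Fraction(0)


def left_limit(x):
    """F(x-) = P(X < x): total mass at points strictly less than x."""
    k = bisect_left(points, x)
    return cumulative[k - 1] if k else Fraction(0)


def jump(x):
    """Size of the jump of F at x, which equals P(X = x)."""
    return cdf(x) - left_limit(x)
```

In this sketch `cdf` is constant between consecutive points of the support, and `jump(u)` recovers the point probability at each `u`; at any other location the jump is zero.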

Consequently, a discrete probability distribution is often represented as a generalized probability density function involving Dirac delta functions, which substantially unifies the treatment of continuous and discrete distributions. This is especially useful when dealing with probability distributions involving both a continuous and a discrete part.
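Written out, such a representation might take the following form, where $u_0, u_1, \dots$ denote the values taken with non-zero probability and $\delta$ is the Dirac delta function (the symbols $f$ and $F$ are notation chosen for this sketch):

```latex
f(x) = \sum_i \Pr(X = u_i)\, \delta(x - u_i),
\qquad
F(x) = \int_{-\infty}^{x} f(t)\, dt = \sum_{i \,:\, u_i \le x} \Pr(X = u_i).
```

For the distribution in the figure caption, this would read $f(x) = 0.2\,\delta(x - 1) + 0.5\,\delta(x - 3) + 0.3\,\delta(x - 7)$.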

Representation in terms of indicator functions

For a discrete random variable X, let $u_0, u_1, \dots$ be the values it can take with non-zero probability. Denote

$$\Omega_i = \{\omega : X(\omega) = u_i\}, \qquad i = 0, 1, 2, \dots$$

These are disjoint sets, and by formula (1)

$$\Pr\left(\bigcup_i \Omega_i\right) = \sum_i \Pr(\Omega_i) = \sum_i \Pr(X = u_i) = 1.$$

It follows that the probability that X takes any value except for $u_0, u_1, \dots$ is zero, and thus one can write X as

$$X = \sum_i u_i 1_{\Omega_i}$$

except on a set of probability zero, where $1_A$ is the indicator function of A. This may serve as an alternative definition of discrete random variables.
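The representation can be checked on a small finite example. The sketch below assumes a hypothetical sample space (the six faces of a die) and a hypothetical random variable X(ω) = ω mod 3. It takes each set Ω_i to be the level set {ω : X(ω) = u_i} and rebuilds X as the sum of the values u_i weighted by the indicator functions of those sets.

```python
# Hypothetical sample space: the six faces of a die.
omega_space = range(1, 7)


def X(omega):
    """Hypothetical discrete random variable on the sample space."""
    return omega % 3


# The values u_0, u_1, ... that X takes.
values = sorted({X(w) for w in omega_space})

# Level sets Omega_i = {omega : X(omega) = u_i}, keyed by the value u_i.
level_sets = {u: {w for w in omega_space if X(w) == u} for u in values}


def indicator(A):
    """Indicator function of the set A: 1 on A, 0 elsewhere."""
    return lambda w: 1 if w in A else 0


def X_rebuilt(w):
    """X written as the sum over i of u_i times the indicator of Omega_i."""
    return sum(u * indicator(level_sets[u])(w) for u in values)
```

Because the level sets are disjoint and cover the sample space, exactly one indicator is non-zero for each outcome, so `X_rebuilt` agrees with `X` everywhere in this finite example.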

See also