
Binary entropy function

Entropy of a Bernoulli trial as a function of success probability, called the binary entropy function.

In information theory, the binary entropy function, denoted $\operatorname{H}(p)$ or $\operatorname{H}_\text{b}(p)$, is defined as the entropy of a Bernoulli process with probability of success $p$. Mathematically, the Bernoulli trial is modelled as a random variable $X$ that can take on only two values: 0 and 1. The event $X = 1$ is considered a success and the event $X = 0$ is considered a failure. (These two events are mutually exclusive and exhaustive.)

If $\operatorname{P}(X=1) = p$, then $\operatorname{P}(X=0) = 1 - p$ and the entropy of $X$ is given by

$$\operatorname{H}(X) = \operatorname{H}_\text{b}(p) = -p \log p - (1 - p) \log (1 - p),$$

where $0 \log 0$ is taken to be 0. The logarithms in this formula are usually taken (as shown in the graph) to the base 2. See binary logarithm.
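
As a quick computational illustration of this definition, the sketch below (in Python, with the arbitrarily named helper binary_entropy) evaluates $\operatorname{H}_\text{b}(p)$ in bits:

from math import log2

def binary_entropy(p: float) -> float:
    # Binary entropy H_b(p) in bits; 0*log(0) is taken to be 0.
    if p == 0.0 or p == 1.0:
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

print(binary_entropy(0.5))   # 1.0 bit, the maximum
print(binary_entropy(0.25))  # about 0.811 bits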

When $p = \tfrac{1}{2}$, the binary entropy function attains its maximum value. This is the case of the unbiased bit, the most common unit of information entropy.
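
For instance, substituting $p = \tfrac{1}{2}$ into the formula above gives

$$\operatorname{H}_\text{b}\!\left(\tfrac{1}{2}\right) = -\tfrac{1}{2} \log_2 \tfrac{1}{2} - \tfrac{1}{2} \log_2 \tfrac{1}{2} = \tfrac{1}{2} + \tfrac{1}{2} = 1 \text{ bit}.$$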

$\operatorname{H}_\text{b}(p)$ is distinguished from the entropy function $\operatorname{H}(X)$ by its taking a single scalar constant parameter. For tutorial purposes, in which the reader may not distinguish the appropriate function by its argument, $\operatorname{H}(p)$ is often used; however, this could confuse this function with the analogous function $\operatorname{H}_2(X)$ related to Rényi entropy, so $\operatorname{H}_\text{b}(p)$ (with "b" not in italics) should be used to dispel ambiguity.

Explanation

In terms of information theory, entropy is considered to be a measure of the uncertainty in a message. To put it intuitively, suppose $p = 0$. At this probability, the event is certain never to occur, and so there is no uncertainty at all, leading to an entropy of 0. If $p = 1$, the result is again certain, so the entropy is 0 here as well. When $p = \tfrac{1}{2}$, the uncertainty is at a maximum; if one were to place a fair bet on the outcome in this case, there is no advantage to be gained with prior knowledge of the probabilities. In this case, the entropy is maximum at a value of 1 bit. Intermediate values fall between these cases; for instance, if $p = \tfrac{3}{4}$, there is still a measure of uncertainty on the outcome, but one can still predict the outcome correctly more often than not, so the uncertainty measure, or entropy, is less than 1 full bit.
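
As a worked check of this last case,

$$\operatorname{H}_\text{b}\!\left(\tfrac{3}{4}\right) = -\tfrac{3}{4} \log_2 \tfrac{3}{4} - \tfrac{1}{4} \log_2 \tfrac{1}{4} = \tfrac{3}{4} \log_2 \tfrac{4}{3} + \tfrac{1}{2} \approx 0.811 \text{ bits},$$

i.e. noticeably less than the full bit of uncertainty at $p = \tfrac{1}{2}$.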

Derivative

The derivative of the binary entropy function may be expressed as the negative of the logit function:

$$\frac{d}{dp} \operatorname{H}_\text{b}(p) = -\operatorname{logit}_2(p) = -\log_2\left(\frac{p}{1-p}\right).$$
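
This follows by differentiating the definition term by term:

$$\frac{d}{dp}\operatorname{H}_\text{b}(p) = -\log_2 p - \frac{1}{\ln 2} + \log_2(1-p) + \frac{1}{\ln 2} = \log_2\frac{1-p}{p} = -\log_2\frac{p}{1-p}.$$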

Taylor series

The Taylor series of the binary entropy function in a neighborhood of $\tfrac{1}{2}$ is

$$\operatorname{H}_\text{b}(p) = 1 - \frac{1}{2\ln 2}\sum_{n=1}^{\infty} \frac{(1-2p)^{2n}}{n(2n-1)}$$

for $0 \le p \le 1$.
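
A short numerical sketch in Python, evaluating partial sums of this series at $p = 0.4$ for illustration, shows how quickly they approach the exact value:

from math import log, log2

def taylor_binary_entropy(p: float, terms: int) -> float:
    # Partial sum of the Taylor series of H_b(p) about p = 1/2.
    s = sum((1 - 2 * p) ** (2 * n) / (n * (2 * n - 1)) for n in range(1, terms + 1))
    return 1 - s / (2 * log(2))

exact = -0.4 * log2(0.4) - 0.6 * log2(0.6)   # about 0.97095
print(taylor_binary_entropy(0.4, terms=1))   # about 0.97115
print(taylor_binary_entropy(0.4, terms=5))   # about 0.97095
print(exact)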
