
Source coding theorem


In information theory, the source coding theorem (Shannon 1948) informally states that:

"N i.i.d. random variables each with entropy H(X) can be compressed into more than NH(X) bits with negligible risk of information loss, as N tends to infinity; but conversely, if they are compressed into fewer than NH(X) bits it is virtually certain that information will be lost." (MacKay 2003).

Devising coding strategies that achieve this compression is the basis of the field of entropy encoding.
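As a concrete illustration of the N·H(X) figure in the informal statement above, the following Python sketch (an illustrative addition, not drawn from the cited sources) computes the entropy of a discrete distribution and the approximate number of bits that suffice to compress N i.i.d. draws from it; the distribution {0.9, 0.1} and the block length N = 10000 are arbitrary choices made here for the example.

import math

def entropy_bits(p):
    """Shannon entropy H(X) in bits of a discrete distribution p (probabilities summing to 1)."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

# Illustrative source: a biased binary ensemble with P(x=1) = 0.1, P(x=0) = 0.9.
p = [0.9, 0.1]
H = entropy_bits(p)
N = 10000  # number of i.i.d. symbols in the block

print(f"H(X) = {H:.4f} bits/symbol")
print(f"about {N * H:.0f} bits suffice for {N} symbols as N grows large,")
print(f"compared with {N} bits for the uncompressed binary sequence")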

A more mathematical statement of the theorem is:

Let X be an ensemble with entropy H(X) = H bits. Given ε > 0 and 0 < δ < 1, there exists a positive integer N0 such that for N > N0,

    |(1/N) Hδ(X^N) − H| < ε,

where Hδ(X) = log2|Sδ(X)|, and Sδ(X) is the smallest subset of values of X such that the probability that x is not in Sδ(X) is at most δ. (MacKay 2003)
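The quantity Hδ(X^N) can be computed exactly for small sources. The sketch below (an illustration added here, not part of the original statement) does so for a Bernoulli source by collecting the most probable length-N binary sequences until their total probability reaches 1 − δ; the source parameter p1 = 0.1 and δ = 0.01 are arbitrary choices for the example. The printed values show (1/N)Hδ(X^N) approaching H as N grows, as the theorem asserts.

import math

def entropy_bits(p):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def H_delta(p1, N, delta):
    """log2 |S_delta(X^N)| for a Bernoulli(p1) source: the log-size of the
    smallest set of length-N binary sequences whose total probability is
    at least 1 - delta.  Sequences with the same number of ones share a
    probability, so the N+1 probability classes are enumerated instead of
    all 2^N individual sequences."""
    classes = sorted(
        (((p1 ** k) * ((1 - p1) ** (N - k)), math.comb(N, k)) for k in range(N + 1)),
        reverse=True,  # most probable sequences first
    )
    target = 1.0 - delta
    covered, size = 0.0, 0
    for prob_each, count in classes:
        if covered >= target or prob_each <= 0.0:
            break
        # take only as many sequences from this class as are still needed
        needed = min(count, math.ceil((target - covered) / prob_each))
        covered += needed * prob_each
        size += needed
    return math.log2(size)

p1, delta = 0.1, 0.01
H = entropy_bits([1 - p1, p1])
for N in (10, 100, 1000):
    print(f"N={N:5d}: (1/N) H_delta(X^N) = {H_delta(p1, N, delta) / N:.4f}   (H = {H:.4f})")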

The source coding theorem is closely related to the asymptotic equipartition property and the notion of the typical set.
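The asymptotic equipartition property itself is easy to observe numerically: for long i.i.d. sequences, −(1/N) log2 p(x^N) concentrates around H(X). The short sketch below (again an illustrative addition, with the Bernoulli parameter 0.1 and the fixed random seed chosen arbitrarily) draws one sequence per block length and prints this quantity next to H.

import math
import random

random.seed(0)
p1 = 0.1
H = -(p1 * math.log2(p1) + (1 - p1) * math.log2(1 - p1))

def per_symbol_information(N):
    """Draw one length-N i.i.d. Bernoulli(p1) sequence and return -(1/N) log2 p(x^N)."""
    ones = sum(random.random() < p1 for _ in range(N))
    return -(ones * math.log2(p1) + (N - ones) * math.log2(1 - p1)) / N

for N in (10, 100, 10000):
    print(f"N={N:6d}: -(1/N) log2 p(x^N) = {per_symbol_information(N):.4f}   (H = {H:.4f})")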