
Data processing inequality

From Wikipedia, the free encyclopedia

The data processing inequality is an information-theoretic concept that states that the information content of a signal cannot be increased via a local physical operation. This can be expressed concisely as 'post-processing cannot increase information'.[1]

Statement

Let three random variables form the Markov chain X → Y → Z, implying that the conditional distribution of Z depends only on Y and is conditionally independent of X. Specifically, we have such a Markov chain if the joint probability mass function can be written as

    p(x, y, z) = p(x) p(y | x) p(z | y).

In this setting, no processing of Y, deterministic or random, can increase the information that Y contains about X. Using the mutual information, this can be written as:

    I(X; Y) ≥ I(X; Z),

with equality if and only if I(X; Y | Z) = 0. That is, Z and Y contain the same information about X, and X → Z → Y also forms a Markov chain.[2]
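The inequality can be checked numerically. The sketch below is not from the article: it assumes a toy binary chain in which X is uniform, Y is X passed through a binary symmetric channel with flip probability 0.1, and Z is Y passed through another with flip probability 0.2 (these parameters and the helper mutual_information are illustrative choices), and confirms that I(X;Y) ≥ I(X;Z).

```python
import numpy as np

# Hypothetical Markov chain X -> Y -> Z with binary variables:
# X ~ Bernoulli(0.5); Y = X through a BSC(0.1); Z = Y through a BSC(0.2).
p_x = np.array([0.5, 0.5])
p_y_given_x = np.array([[0.9, 0.1],
                        [0.1, 0.9]])   # rows indexed by x, columns by y
p_z_given_y = np.array([[0.8, 0.2],
                        [0.2, 0.8]])   # rows indexed by y, columns by z

# Joint distribution p(x, y, z) = p(x) p(y|x) p(z|y)
p_xyz = p_x[:, None, None] * p_y_given_x[:, :, None] * p_z_given_y[None, :, :]

def mutual_information(p_ab):
    """I(A;B) in bits for a joint distribution given as a 2-D array."""
    p_a = p_ab.sum(axis=1, keepdims=True)
    p_b = p_ab.sum(axis=0, keepdims=True)
    mask = p_ab > 0
    return float((p_ab[mask] * np.log2(p_ab[mask] / (p_a @ p_b)[mask])).sum())

p_xy = p_xyz.sum(axis=2)   # marginalize out z
p_xz = p_xyz.sum(axis=1)   # marginalize out y

i_xy = mutual_information(p_xy)
i_xz = mutual_information(p_xz)
print(f"I(X;Y) = {i_xy:.4f} bits, I(X;Z) = {i_xz:.4f} bits")
assert i_xy >= i_xz - 1e-12   # data processing inequality holds
```

With these channel parameters the second noisy step strictly loses information about X, so the inequality is strict; making the second channel noiseless (flip probability 0) would give equality.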

Proof

One can apply the chain rule for mutual information to obtain two different decompositions of I(X; Y, Z):

    I(X; Y, Z) = I(X; Z) + I(X; Y | Z) = I(X; Y) + I(X; Z | Y).

By the Markov relationship X → Y → Z, we know that X and Z are conditionally independent given Y, which means the conditional mutual information I(X; Z | Y) = 0. The two decompositions therefore give I(X; Y) = I(X; Z) + I(X; Y | Z), and the data processing inequality follows from the non-negativity of I(X; Y | Z).
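For illustration, the sketch below (an assumption-laden example reusing the same toy channel parameters as above, with entropy-based formulas chosen for convenience) checks the two chain-rule decompositions and that I(X;Z|Y) vanishes for the Markov chain, from which the inequality follows.

```python
import numpy as np

# Same hypothetical chain as above: X -> BSC(0.1) -> Y -> BSC(0.2) -> Z.
p_x = np.array([0.5, 0.5])
p_y_given_x = np.array([[0.9, 0.1], [0.1, 0.9]])
p_z_given_y = np.array([[0.8, 0.2], [0.2, 0.8]])
p_xyz = p_x[:, None, None] * p_y_given_x[:, :, None] * p_z_given_y[None, :, :]

def entropy(p):
    """Shannon entropy in bits of a joint distribution of any shape."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Marginal and joint entropies of X, Y, Z.
H_x   = entropy(p_xyz.sum(axis=(1, 2)))
H_y   = entropy(p_xyz.sum(axis=(0, 2)))
H_z   = entropy(p_xyz.sum(axis=(0, 1)))
H_xy  = entropy(p_xyz.sum(axis=2))
H_xz  = entropy(p_xyz.sum(axis=1))
H_yz  = entropy(p_xyz.sum(axis=0))
H_xyz = entropy(p_xyz)

# Mutual informations via entropies:
# I(A;B) = H(A) + H(B) - H(A,B); I(A;B|C) = H(A,C) + H(B,C) - H(A,B,C) - H(C).
I_xy         = H_x + H_y - H_xy
I_xz         = H_x + H_z - H_xz
I_x_yz       = H_x + H_yz - H_xyz            # I(X; Y,Z)
I_xz_given_y = H_xy + H_yz - H_xyz - H_y     # I(X; Z | Y)
I_xy_given_z = H_xz + H_yz - H_xyz - H_z     # I(X; Y | Z)

# Two chain-rule decompositions of I(X; Y,Z):
assert np.isclose(I_x_yz, I_xz + I_xy_given_z)
assert np.isclose(I_x_yz, I_xy + I_xz_given_y)
# Markov property X -> Y -> Z makes I(X;Z|Y) vanish, giving the inequality.
assert np.isclose(I_xz_given_y, 0.0)
assert I_xy >= I_xz
```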


References

  1. ^ Beaudry, Normand (2012), "An intuitive proof of the data processing inequality", Quantum Information & Computation, 12 (5–6): 432–441, arXiv:1107.0740, Bibcode:2011arXiv1107.0740B, doi:10.26421/QIC12.5-6-4, S2CID 9531510
  2. ^ Cover, Thomas M.; Thomas, Joy A. (2012). Elements of Information Theory. John Wiley & Sons.