Information theory and measure theory


If we associate the existence of sets $\tilde{X}$ and $\tilde{Y}$ with arbitrary discrete random variables X and Y, somehow representing the information borne by X and Y, respectively, such that:

  • $\mu(\tilde{X} \cap \tilde{Y}) = 0$ whenever X and Y are independent, and
  • $\tilde{X} = \tilde{Y}$ whenever X and Y are such that either one is completely determined by the other (i.e. by a bijection);

where $\mu$ is a measure over these sets, and we set:

  $H(X) = \mu(\tilde{X}),$
  $H(Y) = \mu(\tilde{Y}),$
  $H(X,Y) = \mu(\tilde{X} \cup \tilde{Y}),$
  $H(X|Y) = \mu(\tilde{X} \setminus \tilde{Y}),$
  $I(X;Y) = \mu(\tilde{X} \cap \tilde{Y});$
we find that Shannon's "measure" of information content satisfies all the postulates and basic properties of a formal measure over sets. This can be a handy mnemonic device in some situations. Certain extensions to the definitions of Shannon's basic measures of information are necessary to deal with the σ-algebra generated by the sets that would be associated to three or more arbitrary random variables. (See Reza pp. 106-108 for an informal but rather complete discussion.) Namely $H(X,Y,Z,\ldots)$ needs to be defined in the obvious way as the entropy of a joint distribution, and an extended transinformation $I(X;Y;Z;\ldots)$ defined in a suitable manner (left as an exercise for the ambitious reader) so that we can set:

  $H(X,Y,Z,\ldots) = \mu(\tilde{X} \cup \tilde{Y} \cup \tilde{Z} \cup \cdots),$
  $I(X;Y;Z;\ldots) = \mu(\tilde{X} \cap \tilde{Y} \cap \tilde{Z} \cap \cdots);$
in order to define the (signed) measure over the whole σ-algebra. (It is interesting to note that the mutual information of three or more random variables can be negative as well as positive: Let X and Y be two independent fair coin flips, and let Z be their exclusive or. Then $I(X;Y;Z) = -1$ bit.)
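
To make this example concrete, the short Python sketch below (not part of the original article) tabulates the joint distribution of X, Y and Z = X xor Y, computes the relevant entropies, and evaluates the "intersection" terms by inclusion-exclusion; the helper function H and the encoding of the joint distribution are assumptions made purely for illustration.

```python
# A minimal sketch, assuming a direct tabulation of the joint distribution;
# the names (joint, H) are illustrative and not from the article.
from collections import defaultdict
from itertools import product
from math import log2

# X, Y independent fair coin flips; Z = X xor Y.  Four equally likely outcomes.
joint = {(x, y, x ^ y): 0.25 for x, y in product((0, 1), repeat=2)}

def H(*coords):
    """Shannon entropy (in bits) of the marginal on the given coordinates."""
    marginal = defaultdict(float)
    for outcome, p in joint.items():
        marginal[tuple(outcome[i] for i in coords)] += p
    return -sum(p * log2(p) for p in marginal.values() if p > 0)

# Two-variable identities, mirroring mu(A u B) = mu(A) + mu(B) - mu(A n B)
# and mu(A \ B) = mu(A u B) - mu(B):
I_XY = H(0) + H(1) - H(0, 1)   # I(X;Y), the "intersection" of X and Y -> 0.0
H_X_given_Y = H(0, 1) - H(1)   # H(X|Y), the "set difference"          -> 1.0

# Triple intersection by inclusion-exclusion over the three sets:
I_XYZ = (H(0) + H(1) + H(2)
         - H(0, 1) - H(0, 2) - H(1, 2)
         + H(0, 1, 2))          # I(X;Y;Z) -> -1.0 bit, negative as claimed

print(I_XY, H_X_given_Y, I_XYZ)
```

Running this prints 0.0 1.0 -1.0, consistent with the set picture: $\tilde{X}$ and $\tilde{Y}$ are "disjoint" since X and Y are independent, yet the triple intersection carries measure -1 bit.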

This formulation reiterates and clarifies the fundamental properties of these basic concepts of information theory.