Partial information decomposition (PID) is an extension of information theory that aims to generalize the pairwise relations described by information theory to the interaction of multiple variables.[1]
Motivation
Information theory can quantify the amount of information a single source variable $X_1$ has about a target variable $Y$ via the mutual information $I(X_1; Y)$. If we now consider a second source variable $X_2$, classical information theory can only describe the mutual information of the joint variable $(X_1, X_2)$ with $Y$, given by $I(X_1, X_2; Y)$. In general, however, it would be interesting to know how exactly the individual variables $X_1$ and $X_2$ and their interactions relate to $Y$.
Consider that we are given two independent fair coin flips $X_1$ and $X_2$ as source variables and the target variable $Y = X_1 \oplus X_2$, their exclusive OR. In this case the total mutual information is $I(X_1, X_2; Y) = 1$ bit, while the individual mutual informations are $I(X_1; Y) = I(X_2; Y) = 0$ bits. That is, there is synergistic information arising from the interaction of $X_1$ and $X_2$ about $Y$, which cannot be easily captured with classical information-theoretic quantities.
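This can be checked numerically. The following minimal Python sketch (the helper functions are illustrative, not taken from any PID library) computes the mutual informations of the XOR example directly from the joint distribution:

```python
import numpy as np

# Joint distribution of (X1, X2, Y) for two fair coin flips and Y = X1 XOR X2:
# each of the four (x1, x2) combinations has probability 1/4.
states = [(x1, x2, x1 ^ x2) for x1 in (0, 1) for x2 in (0, 1)]
p = {s: 0.25 for s in states}

def entropy(dist):
    """Shannon entropy in bits of a dict mapping outcomes to probabilities."""
    return -sum(q * np.log2(q) for q in dist.values() if q > 0)

def marginal(p, idx):
    """Marginal distribution over the given coordinate indices."""
    out = {}
    for s, q in p.items():
        key = tuple(s[i] for i in idx)
        out[key] = out.get(key, 0.0) + q
    return out

def mutual_information(p, a, b):
    """I(A;B) = H(A) + H(B) - H(A,B), with A and B given as index tuples."""
    return entropy(marginal(p, a)) + entropy(marginal(p, b)) - entropy(marginal(p, a + b))

print(mutual_information(p, (0, 1), (2,)))  # I(X1,X2;Y) = 1.0 bit
print(mutual_information(p, (0,), (2,)))    # I(X1;Y)    = 0.0 bits
print(mutual_information(p, (1,), (2,)))    # I(X2;Y)    = 0.0 bits
```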
Definition for two source variables
Partial information decomposition further decomposes the mutual information of the source variables $(X_1, X_2)$ with the target variable $Y$ as

$I(X_1, X_2; Y) = \text{Unq}(X_1; Y \setminus X_2) + \text{Unq}(X_2; Y \setminus X_1) + \text{Syn}(X_1, X_2; Y) + \text{Red}(X_1, X_2; Y)$
Here the individual components, called "information atoms", are defined as
- $\text{Unq}(X_1; Y \setminus X_2)$ is the unique information that $X_1$ has about $Y$, which is not in $X_2$
- $\text{Unq}(X_2; Y \setminus X_1)$ is the unique information that $X_2$ has about $Y$, which is not in $X_1$
- $\text{Syn}(X_1, X_2; Y)$ is the synergistic information about $Y$ that is only in the interaction of $X_1$ and $X_2$
- $\text{Red}(X_1, X_2; Y)$ is the redundant information about $Y$ that is in both $X_1$ and $X_2$
There is, thus far, no universal agreement on how these terms should be defined, with different approaches that decompose information into redundant, unique, and synergistic components appearing in the literature.[1][2][3][4]
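To make this concrete, the sketch below implements one such choice, the original $I_{\min}$ redundancy measure of Williams and Beer,[1] for two sources and recovers the remaining atoms from the decomposition equation above. It is a minimal illustration, not a reference implementation; the AND-gate example distribution and all function names are choices made here:

```python
import numpy as np
from collections import defaultdict

# Example joint distribution p(x1, x2, y): two fair coins and their AND.
p = defaultdict(float)
for x1 in (0, 1):
    for x2 in (0, 1):
        p[(x1, x2, x1 & x2)] += 0.25

def marginal(p, idx):
    out = defaultdict(float)
    for s, q in p.items():
        out[tuple(s[i] for i in idx)] += q
    return out

def mi(p, a, b):
    """Mutual information I(A;B) in bits, A and B given as index tuples."""
    pa, pb, pab = marginal(p, a), marginal(p, b), marginal(p, a + b)
    return sum(q * np.log2(q / (pa[s[:len(a)]] * pb[s[len(a):]]))
               for s, q in pab.items() if q > 0)

def specific_info(p, src, y):
    """Specific information I(Y=y; X_src) = sum_a p(a|y) log2[ p(y|a) / p(y) ]."""
    py = marginal(p, (2,))[(y,)]
    pa, pay = marginal(p, src), marginal(p, src + (2,))
    return sum((pay[a + (y,)] / py) * np.log2((pay[a + (y,)] / qa) / py)
               for a, qa in pa.items() if pay[a + (y,)] > 0)

# Williams-Beer redundancy I_min: expected minimum specific information.
red = sum(qy * min(specific_info(p, (0,), y), specific_info(p, (1,), y))
          for (y,), qy in marginal(p, (2,)).items())

unq1 = mi(p, (0,), (2,)) - red            # Unq(X1; Y \ X2)
unq2 = mi(p, (1,), (2,)) - red            # Unq(X2; Y \ X1)
syn = mi(p, (0, 1), (2,)) - red - unq1 - unq2
# For the AND gate this yields Red ~ 0.311, Unq1 = Unq2 = 0, Syn = 0.5 bits.
print(f"Red={red:.3f}  Unq1={unq1:.3f}  Unq2={unq2:.3f}  Syn={syn:.3f}")
```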
General definition
A key principle for generalizing PID to $n$ source variables is the notion of containment of information within different subsets of those variables.[5] For $n = 2$, there are three relevant subsets of source indices: $\{1\}$, $\{2\}$, and $\{1, 2\}$. Every possible way in which information about the target could be distributed over subsets of sources can be represented in a containment table, as shown below. This table also includes the empty set for completeness.
Type | $\emptyset$ | $\{1\}$ | $\{2\}$ | $\{1,2\}$
---|---|---|---|---
Redundancy | 0 | 1 | 1 | 1
Unique to $X_1$ | 0 | 1 | 0 | 1
Unique to $X_2$ | 0 | 0 | 1 | 1
Synergy | 0 | 0 | 0 | 1
In this table, a 1 indicates that the information atom is contained in that subset of source variables, while a 0 indicates that it is not. Three main constraints ensure the table captures meaningful information relationships:
- Empty-set constraint: There is no information in the empty set, so the first column must always be 0.
- Full-set constraint: All the information available from the sources is contained in the full set, so the last column must always be 1.
- Monotonicity: If an atom is contained in some subset of sources, it must also be contained in any superset of that subset (the table entries cannot “go back” from 1 to 0 as you move to larger sets).
To handle an arbitrary number of sources, one constructs a larger containment table under these same constraints. Each row corresponds to an information atom, characterized by how it is contained in the different subsets of source variables. Mathematically, each row can be seen as a monotonic Boolean function from the power set of sources to {0,1}, and the number of such functions is given by the Dedekind numbers (minus two, since the constant-zero and constant-one functions are prohibited by the first two constraints).
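For small $n$, this count can be verified by brute force: enumerate every 0/1 assignment on the power set and keep those satisfying the three constraints. A minimal sketch (the function names are illustrative):

```python
from itertools import combinations, product

def subsets(n):
    """All subsets of {1, ..., n} as frozensets, ordered by size."""
    elems = range(1, n + 1)
    return [frozenset(c) for r in range(n + 1) for c in combinations(elems, r)]

def count_atoms(n):
    """Count valid containment-table rows for n sources: monotone Boolean
    functions on the power set, excluding the two constant functions ruled
    out by the empty-set and full-set constraints. Equals Dedekind(n) - 2."""
    subs = subsets(n)
    count = 0
    for bits in product((0, 1), repeat=len(subs)):
        f = dict(zip(subs, bits))
        if f[subs[0]] != 0 or f[subs[-1]] != 1:  # empty set -> 0, full set -> 1
            continue
        if all(f[a] <= f[b] for a in subs for b in subs if a <= b):  # monotonicity
            count += 1
    return count

print(count_atoms(2))  # 4 atoms for two sources
print(count_atoms(3))  # 18 atoms for three sources
```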
It is conventional to label an information atom by the smallest subsets (in terms of the subset relation) from which it can be obtained. This is sufficient because, due to the monotonicity constraint, the atom must then be contained in all larger sets as well. For $n = 2$, for example:
- The redundancy atom is denoted $\Pi(\{1\}\{2\})$.
- The unique atoms are $\Pi(\{1\})$ and $\Pi(\{2\})$.
- The synergy atom is $\Pi(\{1,2\})$.
Generally, the sets listed in the argument of $\Pi$ form a so-called antichain, i.e. a collection of sets in which no set is a subset of another. A generic antichain is denoted by the letter $\alpha$ and the information atom corresponding to it by $\Pi(\alpha)$.
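Recovering the antichain label from a containment-table row amounts to selecting the minimal sets that carry a 1, as in the following sketch (illustrative names), applied here to row 11 of the $n = 3$ table below:

```python
def antichain(f):
    """Minimal subsets on which a monotone containment-table row f equals 1;
    f maps frozensets of source indices to 0 or 1."""
    ones = [s for s, v in f.items() if v == 1]
    return [s for s in ones if not any(t < s for t in ones)]  # '<' is strict subset

# Row 11 of the n = 3 table: value 1 on {1}, {1,2}, {1,3}, {2,3}, {1,2,3}.
row = {frozenset(s): 0
       for s in [(), (1,), (2,), (3,), (1, 2), (1, 3), (2, 3), (1, 2, 3)]}
for s in [(1,), (1, 2), (1, 3), (2, 3), (1, 2, 3)]:
    row[frozenset(s)] = 1

print(antichain(row))  # [frozenset({1}), frozenset({2, 3})] -> atom Pi({1}{2,3})
```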
The full containment table for $n = 3$, showing all 18 distinct atoms and their antichain representation, is given below:
# | Atom | $\emptyset$ | $\{1\}$ | $\{2\}$ | $\{3\}$ | $\{1,2\}$ | $\{1,3\}$ | $\{2,3\}$ | $\{1,2,3\}$
---|---|---|---|---|---|---|---|---|---
1 | $\Pi(\{1,2,3\})$ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1
2 | $\Pi(\{1\})$ | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 1
3 | $\Pi(\{2\})$ | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 1
4 | $\Pi(\{3\})$ | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 1
5 | $\Pi(\{1,2\})$ | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1
6 | $\Pi(\{1,3\})$ | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1
7 | $\Pi(\{2,3\})$ | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1
8 | $\Pi(\{1\}\{2\})$ | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 1
9 | $\Pi(\{1\}\{3\})$ | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 1
10 | $\Pi(\{2\}\{3\})$ | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1
11 | $\Pi(\{1\}\{2,3\})$ | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 1
12 | $\Pi(\{2\}\{1,3\})$ | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 1
13 | $\Pi(\{3\}\{1,2\})$ | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1
14 | $\Pi(\{1,2\}\{1,3\})$ | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1
15 | $\Pi(\{1,2\}\{2,3\})$ | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1
16 | $\Pi(\{1,3\}\{2,3\})$ | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1
17 | $\Pi(\{1\}\{2\}\{3\})$ | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1
18 | $\Pi(\{1,2\}\{1,3\}\{2,3\})$ | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1
For an arbitrary set of sources $\{X_1, \dots, X_n\}$, a partial information decomposition is a collection of atoms $\Pi(\alpha)$, one per antichain, so that for every subset of source indices $T \subseteq \{1, \dots, n\}$ the following condition is satisfied:

$I(X_T; Y) = \sum_{\alpha \,:\, \exists A \in \alpha,\ A \subseteq T} \Pi(\alpha)$

where $I(X_T; Y)$ is the mutual information provided by the sources indexed by $T$. Intuitively, this condition says that the mutual information provided by $X_T$ should be the sum of the atoms that are contained in it, i.e. those with a 1 in the column of $T$ in the containment table. By construction, these are exactly the atoms that have a subset of $T$ in their antichain representation.

For $T = \{1, \dots, n\}$, the condition states that all atoms have to sum up to the joint mutual information $I(X_1, \dots, X_n; Y)$. This is because the sets contained in any antichain $\alpha$ are always subsets of the full set, and therefore the condition under the summation sign is always fulfilled.
The condition above specifies a linear system of equations relating the information atoms to mutual information. However, since the number of subsets of sources grows exponentially in $n$ while the number of information atoms grows super-exponentially, the system becomes more and more underdetermined as $n$ grows. Already for $n = 2$ it provides three equations for four unknowns. Hence, there is no unique solution for the information atoms $\Pi(\alpha)$.
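For $n = 2$ the system can be written out explicitly, and its rank deficiency checked directly; a minimal sketch (the atom ordering chosen here follows the two-source containment table above):

```python
import numpy as np

# Columns: [Red, Unq1, Unq2, Syn]; rows: the subsets T = {1}, {2}, {1,2}.
# Entry 1 iff the atom is contained in T (see the containment table).
A = np.array([
    [1, 1, 0, 0],   # I(X1;Y)    = Red + Unq1
    [1, 0, 1, 0],   # I(X2;Y)    = Red + Unq2
    [1, 1, 1, 1],   # I(X1,X2;Y) = Red + Unq1 + Unq2 + Syn
])

print(np.linalg.matrix_rank(A))  # rank 3 < 4 unknowns: one constraint short
```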
Various proposals have been made to resolve this underdetermination by imposing additional constraints on the information atoms. Typically, these constraints tie the atoms to other information-theoretic quantities besides the mutual information. For instance, the original approach by Williams and Beer[1] introduced a "redundancy function," while other authors have used "union information" or synergy functions. Importantly, these quantities are in general (i.e. for $n > 2$) themselves composed of multiple information atoms and should not be confused with a single "redundancy atom" or "synergy atom."
Applications
Despite the lack of universal agreement, partial information decomposition has been applied to diverse fields, including climatology,[6] neuroscience,[7][8][9] sociology,[10] and machine learning.[11] Partial information decomposition has also been proposed as a possible foundation on which to build a mathematically robust definition of emergence in complex systems[12] and may be relevant to formal theories of consciousness.[13]
References
1. Williams PL, Beer RD (2010-04-14). "Nonnegative Decomposition of Multivariate Information". arXiv:1004.2515 [cs.IT].
2. Quax R, Har-Shemesh O, Sloot PM (February 2017). "Quantifying Synergistic Information Using Intermediate Stochastic Variables". Entropy. 19 (2): 85. arXiv:1602.01265. doi:10.3390/e19020085. ISSN 1099-4300.
3. Rosas FE, Mediano PA, Rassouli B, Barrett AB (2020-12-04). "An operational information decomposition via synergistic disclosure". Journal of Physics A: Mathematical and Theoretical. 53 (48): 485001. arXiv:2001.10387. Bibcode:2020JPhA...53V5001R. doi:10.1088/1751-8121/abb723. ISSN 1751-8113. S2CID 210932609.
4. Kolchinsky A (March 2022). "A Novel Approach to the Partial Information Decomposition". Entropy. 24 (3): 403. arXiv:1908.08642. Bibcode:2022Entrp..24..403K. doi:10.3390/e24030403. PMC 8947370. PMID 35327914.
5. Gutknecht AJ, Wibral M, Makkeh A (2021). "Bits and Pieces: Understanding Information Decomposition from Part-Whole Relationships and Formal Logic". Proceedings of the Royal Society A. 477 (2251): 20210110. arXiv:2101.03968. doi:10.1098/rspa.2021.0110.
6. Goodwell AE, Jiang P, Ruddell BL, Kumar P (February 2020). "Debates—Does Information Theory Provide a New Paradigm for Earth Science? Causality, Interaction, and Feedback". Water Resources Research. 56 (2). Bibcode:2020WRR....5624940G. doi:10.1029/2019WR024940. ISSN 0043-1397. S2CID 216201598.
7. Newman EL, Varley TF, Parakkattu VK, Sherrill SP, Beggs JM (July 2022). "Revealing the Dynamics of Neural Information Processing with Multivariate Information Decomposition". Entropy. 24 (7): 930. Bibcode:2022Entrp..24..930N. doi:10.3390/e24070930. PMC 9319160. PMID 35885153.
8. Luppi AI, Mediano PA, Rosas FE, Holland N, Fryer TD, O'Brien JT, et al. (June 2022). "A synergistic core for human brain evolution and cognition". Nature Neuroscience. 25 (6): 771–782. doi:10.1038/s41593-022-01070-0. PMC 7614771. PMID 35618951. S2CID 249096746.
9. Wibral M, Priesemann V, Kay JW, Lizier JT, Phillips WA (March 2017). "Partial information decomposition as a unified approach to the specification of neural goal functions". Brain and Cognition. 112: 25–38. arXiv:1510.00831. doi:10.1016/j.bandc.2015.09.004. PMID 26475739. S2CID 4394452.
10. Varley TF, Kaminski P (October 2022). "Untangling Synergistic Effects of Intersecting Social Identities with Partial Information Decomposition". Entropy. 24 (10): 1387. Bibcode:2022Entrp..24.1387V. doi:10.3390/e24101387. ISSN 1099-4300. PMC 9611752. PMID 37420406.
11. Tax TM, Mediano PA, Shanahan M (September 2017). "The Partial Information Decomposition of Generative Neural Network Models". Entropy. 19 (9): 474. Bibcode:2017Entrp..19..474T. doi:10.3390/e19090474. hdl:10044/1/50586. ISSN 1099-4300.
12. Mediano PA, Rosas FE, Luppi AI, Jensen HJ, Seth AK, Barrett AB, et al. (July 2022). "Greater than the parts: a review of the information decomposition approach to causal emergence". Philosophical Transactions. Series A, Mathematical, Physical, and Engineering Sciences. 380 (2227): 20210246. doi:10.1098/rsta.2021.0246. PMC 9125226. PMID 35599558.
13. Luppi AI, Mediano PA, Rosas FE, Harrison DJ, Carhart-Harris RL, Bor D, Stamatakis EA (2021). "What it is like to be a bit: an integrated information decomposition account of emergent mental phenomena". Neuroscience of Consciousness. 2021 (2): niab027. doi:10.1093/nc/niab027. PMC 8600547. PMID 34804593.