Jump to content

User:Ensemblized/sandbox

From Wikipedia, the free encyclopedia

The degree-corrected stochastic block model is a generalization of the stochastic block model, a generative model for random graphs, taking into consideration the broad degree distribution of empirical networks. This model tends to produce graphs containing blocks, subsets of nodes characterized by having the same statistical tendencies, in addition to having an arbitrary degree distribution within each block. For example, edges may be more common within communities than between communities, and within communities different activity rates (degree) could exist.

The current formulation of the degree-corrected stochastic block model was introduced in 2011 in the field of network science by Karrer and Newman.[1] The degree-corrected stochastic block model outperforms the simple stochastic degree-corrected in the task of community detection.

Definition

[edit]

The stochastic block model takes the following parameters:

  • The number of vertices;
  • a partition of the vertex set into disjoint subsets , called blocks;
  • a symmetric matrix indicating the interaction intensity between the nodes from the respective block;
  • The number of expected degrees of vertices .

The expected value of the adjacency matrix element , indicating the probability of connection between nodes and would be where respectively denote the block memberships of nodes and . This probability incorporates both the activity rates (average degrees) of each node and also the interaction intensity based on their block membership.

Special cases

[edit]

In case the activity rates are constant and uniform across all the nodes making up the network, the model would be identical to a regular Stochastic Block Model where the matrix would be reduced to the probability of connection between the two blocks.

If the interaction between all groups is uniform (), the model will be reduced to a configuration model where indicate the outgoing stubs/degree of each node.

In case both and are constant, then the result is the Erdős–Rényi model . This case is degenerate—the partition into communities becomes irrelevant—but it illustrates a close relationship to the Erdős–Rényi model. [2]

Detection

[edit]

Performance

[edit]

In networks with highly homogeneous degree distributions, there is little to no advantage in using the degree corrected version of the stochastic block model instead of the regular version. This is expected as the extra variables in the degree corrected model would only alter the distribution in case of degree homogeneity.

However, in networks with high degree heterogeneity, the degree corrected stochastic block model has a clear advantage. For example, the regular stochastic block model is not capable of capturing the correct blocks of the famous Zachary Karate Club network [3][4][5] but Newman and Karrer demonstrate that the degree stochastic distinguishes the underlying communities of this network with high precision. [1]

This model exhibits asymptotic consistency, meaning that it can correctly infer the blocks in a network produced by a generative stochastic block model. The degree corrected stochastic block model also outperforms the regular version in inferring the blocks of synthetic networks.

Algorithms

[edit]

A diverse set of algorithms can be employed for the degree corrected stochastic block model inference problem. Popular algorithms include spectral clustering of the vertices,[6][7][8][9] semidefinite programming,[10][11] forms of belief propagation,[12][13] and community detection[14] among others.

The recovery of blocks in the model can be done using the principle of maximum likelihood, but this amounts to solving a constrained or regularized cut problem such as minimum bisection that is typically NP-complete. Hence, no known efficient algorithms will correctly compute the maximum-likelihood estimate in the worst case.

Similar Methods

[edit]

Several other methods have been developed to address the degree distribution heterogeneity of empirical networks. Some methods devise more complex exponential random graph models, but are not as analytically tractable as the stochastic block model. Other attempts allow overlapping blocks where nodes can be members of several blocks [15], or mixed memberships, introducing membership probability-like vectors.[16]

Directed stochastic block models with arbitrary expected in-degree and out-degree have also been formalized, however, these models are not solvable in their closed form, limiting their applicability. [17] Degree heterogeneity has also been incorporated in diverse forms. [18][19][20]

See also

[edit]

References

[edit]
  1. ^ a b Karrer, Brian; Newman, M. E. J. (2011-01-21). "Stochastic blockmodels and community structure in networks". Physical Review E. 83 (1). doi:10.1103/physreve.83.016107. ISSN 1539-3755.
  2. ^ Newman, Mark (2010-03-25). Networks. Oxford University Press. ISBN 978-0-19-920665-0.
  3. ^ Zachary, Wayne W. (1977-12). "An Information Flow Model for Conflict and Fission in Small Groups". Journal of Anthropological Research. 33 (4): 452–473. doi:10.1086/jar.33.4.3629752. ISSN 0091-7710. {{cite journal}}: Check date values in: |date= (help)
  4. ^ Bickel, Peter J.; Chen, Aiyou (2009-11-23). "A nonparametric view of network models and Newman–Girvan and other modularities". Proceedings of the National Academy of Sciences. 106 (50): 21068–21073. doi:10.1073/pnas.0907096106. ISSN 0027-8424.
  5. ^ Rosvall, M.; Bergstrom, C. T. (2008-01-23). "Maps of random walks on complex networks reveal community structure". Proceedings of the National Academy of Sciences. 105 (4): 1118–1123. doi:10.1073/pnas.0706851105. ISSN 0027-8424.
  6. ^ Krzakala, Florent; Moore, Cristopher; Mossel, Elchanan; Neeman, Joe; Sly, Allan; Lenka, Lenka; Zhang, Pan (October 2013). "Spectral redemption in clustering sparse networks". Proceedings of the National Academy of Sciences. 110 (52): 20935–20940. arXiv:1306.5550. Bibcode:2013PNAS..11020935K. doi:10.1073/pnas.1312486110. PMC 3876200. PMID 24277835.
  7. ^ Massoulie, Laurent (November 2013). "Community detection thresholds and the weak Ramanujan property". arXiv:1311.3085 [cs.SI].
  8. ^ Abbe, Emmanuel; Sandon, Colin (March 2015). "Community detection in general stochastic block models: fundamental limits and efficient recovery algorithms". arXiv:1503.00609 [math.PR].
  9. ^ Lei, Jing; Rinaldo, Alessandro (February 2015). "Consistency of spectral clustering in stochastic block models". The Annals of Statistics. 43 (1): 215–237. arXiv:1312.2050. doi:10.1214/14-AOS1274. ISSN 0090-5364. S2CID 88519551.
  10. ^ Amini, Arash A.; Levina, Elizaveta (June 2014). "On semidefinite relaxations for the block model". arXiv:1406.5647 [cs.LG].
  11. ^ Abbe, Emmanuel; Bandeira, Afonso S.; Hall, Georgina (May 2014). "Exact Recovery in the Stochastic Block Model". arXiv:1405.3267 [cs.SI].
  12. ^ Decelle, Aurelien; Krzakala, Florent; Moore, Cristopher; Zdeborová, Lenka (September 2011). "Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications". Physical Review E. 84 (6): 066106. arXiv:1109.3041. Bibcode:2011PhRvE..84f6106D. doi:10.1103/PhysRevE.84.066106. PMID 22304154. S2CID 15788070.
  13. ^ Mossel, Elchanan; Neeman, Joe; Sly, Allan (September 2013). "Belief Propagation, Robust Reconstruction, and Optimal Recovery of Block Models". The Annals of Applied Probability. 26 (4): 2211–2256. arXiv:1309.1380. Bibcode:2013arXiv1309.1380M. doi:10.1214/15-AAP1145. S2CID 184446.
  14. ^ Fathi, Reza (April 2019). "Efficient Distributed Community Detection in the Stochastic Block Model". arXiv:1904.07494 [cs.DC].
  15. ^ Latouche, Pierre; Birmelé, Etienne; Ambroise, Christophe (2011-03-01). "Overlapping stochastic block models with application to the French political blogosphere". The Annals of Applied Statistics. 5 (1). doi:10.1214/10-aoas382. ISSN 1932-6157.
  16. ^ M., Airoldi, Edoardo (2014). Handbook of Mixed Membership Models and Their Applications. CRC Press. ISBN 1-322-62245-0. OCLC 901265698.{{cite book}}: CS1 maint: multiple names: authors list (link)
  17. ^ Wang, Yuchung J.; Wong, George Y. (1987-03). "Stochastic Blockmodels for Directed Graphs". Journal of the American Statistical Association. 82 (397): 8–19. doi:10.1080/01621459.1987.10478385. ISSN 0162-1459. {{cite journal}}: Check date values in: |date= (help)
  18. ^ Reichardt, J.; White, D. R. (2007-11). "Role models for complex networks". The European Physical Journal B. 60 (2): 217–224. doi:10.1140/epjb/e2007-00340-y. ISSN 1434-6028. {{cite journal}}: Check date values in: |date= (help)
  19. ^ Newman, M. E. J.; Leicht, E. A. (2007-05-24). "Mixture models and exploratory analysis in networks". Proceedings of the National Academy of Sciences. 104 (23): 9564–9569. doi:10.1073/pnas.0610537104. ISSN 0027-8424.
  20. ^ Coja-Oghlan, Amin; Lanka, André (2010-01). "Finding Planted Partitions in Random Graphs with General Degree Distributions". SIAM Journal on Discrete Mathematics. 23 (4): 1682–1714. doi:10.1137/070699354. ISSN 0895-4801. {{cite journal}}: Check date values in: |date= (help)

Cite error: A list-defined reference named "as15b" is not used in the content (see the help page).
Cite error: A list-defined reference named "gbm" is not used in the content (see the help page).
Cite error: A list-defined reference named "mns12" is not used in the content (see the help page).
Cite error: A list-defined reference named "ar1" is not used in the content (see the help page).
Cite error: A list-defined reference named "hol" is not used in the content (see the help page).
Cite error: A list-defined reference named "ker" is not used in the content (see the help page).

Cite error: A list-defined reference named "pei" is not used in the content (see the help page).

Category:Machine learning Category:Random graphs Category:Networks