Jump to content

Colour refinement algorithm

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Vinallha (talk | contribs) at 16:06, 18 January 2024 (Description: improved and expanded the description of the algorithm). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In graph theory and theoretical computer science, the colour refinement algorithm also known as the naive vertex classification, or the 1-dimensional version of the Weisfeiler-Leman algorithm, is a routine used for testing whether two graphs are isomorphic.[1]

History

Description

The algorithm takes as an input a graph with vertices. It proceeds in iterations and in each iteration we produce a new colouring of the vertices. Formally a "colouring" is a function from the vertices of this graph into some set (of "colours"). In each iteration, we define a sequence of vertex colourings as follows:

  • is the initial colouring. If the graph is unlabeled, the initial colouring assigns a trivial colour to each vertex . If the graph is labelled, is the label of vertex .
  • For all vertices , we set .

In other words, the new colour of the vertex is the pair formed from the previous colour and the multiset of the colours of its neighbours. This algorithm keeps refining the current colouring. At some point it stabilises, i.e., . This final colouring is called the stable colouring.

Expressivity

There are simple examples of graphs that are not distinguished by colour refinement. For example, it does not distinguish a cycle of length 6 from a pair of triangles (example V.1 in [2]). Despite this, the algorithm is very powerful in that a random graph will be identified by the algorithm asymptotically almost surely [3]. Even stronger, it has been shown that as increases, the proportion of graphs that are not identified by colour refinement decreases exponentially in order [4].

Complexity

The stable colouring is computable in O((n+m)log n) where n is the number of vertices and m the number of edges.[5] This complexity has been proven to be optimal under reasonable assumptions.[6]

References

  1. ^ Grohe, Martin; Kersting, Kristian; Mladenov, Martin; Schweitzer, Pascal (2021). "Color Refinement and Its Applications". An Introduction to Lifted Probabilistic Inference. doi:10.7551/mitpress/10548.003.0023. ISBN 9780262365598. S2CID 59069015.
  2. ^ Grohe, Martin (2021-06-29). "The Logic of Graph Neural Networks". 2021 36th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS). LICS '21. New York, NY, USA: Association for Computing Machinery. pp. 1–17. arXiv:2104.14624. doi:10.1109/LICS52264.2021.9470677. ISBN 978-1-6654-4895-6. S2CID 233476550.
  3. ^ Babai, László; Erdo˝s, Paul; Selkow, Stanley M. (August 1980). "Random Graph Isomorphism". SIAM Journal on Computing. 9 (3): 628–635. doi:10.1137/0209047. ISSN 0097-5397.
  4. ^ Canonical labelling of graphs in linear average time | IEEE Conference Publication | IEEE Xplore.
  5. ^ Cardon, A.; Crochemore, M. (1982-07-01). "Partitioning a graph in O(¦A¦log2¦V¦)". Theoretical Computer Science. 19 (1): 85–98. doi:10.1016/0304-3975(82)90016-0. ISSN 0304-3975.
  6. ^ Berkholz, Christoph; Bonsma, Paul; Grohe, Martin (2017-05-01). "Tight Lower and Upper Bounds for the Complexity of Canonical Colour Refinement". Theory of Computing Systems. 60 (4): 581–614. arXiv:1509.08251. doi:10.1007/s00224-016-9686-0. ISSN 1433-0490. S2CID 12616856.