Colour refinement algorithm

In graph theory and theoretical computer science, the colour refinement algorithm, also known as naive vertex classification or the 1-dimensional version of the Weisfeiler-Leman algorithm, is a routine used for testing whether two graphs are isomorphic.^[1]

History

Description

The algorithm takes as input a graph $G$ with $n$ vertices. It proceeds in iterations, and each iteration produces a new colouring of the vertices. Formally, a "colouring" is a function from the vertices of the graph into some set (of "colours"). The sequence of vertex colourings $\lambda _{i}$ is defined as follows:

$\lambda _{0}$ is the initial colouring. If the graph is unlabelled, the initial colouring assigns the same trivial colour $\lambda _{0}(v)$ to each vertex $v$ . If the graph is labelled, $\lambda _{0}(v)$ is the label of vertex $v$ .
For all vertices $v$ , we set $\lambda _{i+1}(v)=\left(\lambda _{i}(v),\{\{\lambda _{i}(w)\mid w{\text{ is a neighbour of }}v\}\}\right)$ .

In other words, the new colour of the vertex $v$ is the pair formed from its previous colour and the multiset of the colours of its neighbours. Each iteration therefore refines the current colouring: vertices that receive different colours are never merged again. At some point the colouring stabilises, i.e., $\lambda _{i+1}$ and $\lambda _{i}$ partition the vertices into the same colour classes. This final colouring is called the stable colouring.
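The iteration above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `colour_refinement`, the adjacency-dictionary input, and the renaming of colours to small integers after each round are all assumptions made here for readability. The renaming keeps colours from growing in size and does not change the partition they induce.

```python
def colour_refinement(adj, labels=None):
    """Return a stable colouring as a dict: vertex -> integer colour.

    adj:    dict mapping each vertex to a list of its neighbours
            (hypothetical input format chosen for this sketch).
    labels: optional dict of vertex labels, used as lambda_0.
    """
    if labels is None:
        # Unlabelled graph: every vertex gets the same trivial colour.
        colour = {v: 0 for v in adj}
    else:
        # Labelled graph: rename the labels to integers so they can be sorted.
        names = {lab: k for k, lab in
                 enumerate(sorted(set(labels.values()), key=repr))}
        colour = {v: names[labels[v]] for v in adj}

    while True:
        # New colour of v: (old colour, multiset of neighbour colours).
        # A sorted tuple serves as a canonical encoding of the multiset.
        signature = {
            v: (colour[v], tuple(sorted(colour[w] for w in adj[v])))
            for v in adj
        }
        # Rename the distinct signatures to 0, 1, 2, ...
        palette = {s: k for k, s in enumerate(sorted(set(signature.values())))}
        refined = {v: palette[signature[v]] for v in adj}
        # Each round refines the previous partition, so the colouring is
        # stable exactly when the number of colour classes stops growing.
        if len(palette) == len(set(colour.values())):
            return refined
        colour = refined


# A path on three vertices: the two endpoints share a colour,
# the middle vertex gets a different one.
path = {0: [1], 1: [0, 2], 2: [1]}
print(colour_refinement(path))  # {0: 0, 1: 1, 2: 0}
```

Comparing the number of colour classes is enough to detect stabilisation here because the old colour is part of every new signature, so classes can only split and never merge.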

Expressivity

There are simple examples of graphs that are not distinguished by colour refinement. For example, it does not distinguish a cycle of length 6 from a pair of triangles (Example V.1 in ^[2]). Despite this, the algorithm is very powerful in that a random graph will be identified by the algorithm asymptotically almost surely.^[3] Even stronger, it has been shown that, as $n$ increases, the proportion of graphs on $n$ vertices that are not identified by colour refinement decreases exponentially in $n$.^[4]
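The cycle-versus-triangles example can be checked directly. The sketch below is illustrative only: the helper names `stable_colouring` and `distinguished` are hypothetical, and comparing two graphs by refining their disjoint union and then comparing the colour histograms of the two parts is one common way to make colours comparable across graphs, assumed here rather than taken from the cited sources.

```python
from collections import Counter


def stable_colouring(adj):
    """Colour refinement on an unlabelled graph given as an adjacency dict."""
    colour = {v: 0 for v in adj}
    while True:
        # (old colour, sorted neighbour colours) for every vertex.
        sig = {v: (colour[v], tuple(sorted(colour[w] for w in adj[v])))
               for v in adj}
        palette = {s: k for k, s in enumerate(sorted(set(sig.values())))}
        refined = {v: palette[sig[v]] for v in adj}
        if len(palette) == len(set(colour.values())):
            return refined
        colour = refined


def distinguished(adj_g, adj_h):
    """True if colour refinement tells the two graphs apart."""
    # Tag vertices with "g" / "h" and refine the disjoint union, so that
    # both graphs are coloured from one shared palette.
    union = {("g", v): [("g", w) for w in nbrs] for v, nbrs in adj_g.items()}
    union.update(
        {("h", v): [("h", w) for w in nbrs] for v, nbrs in adj_h.items()}
    )
    colour = stable_colouring(union)
    hist_g = Counter(c for (side, _), c in colour.items() if side == "g")
    hist_h = Counter(c for (side, _), c in colour.items() if side == "h")
    return hist_g != hist_h


# Cycle of length 6, two disjoint triangles, and a path on 6 vertices.
cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {
    i: [j for j in range(3 * (i // 3), 3 * (i // 3) + 3) if j != i]
    for i in range(6)
}
path6 = {i: [j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)}

print(distinguished(cycle6, two_triangles))  # False
print(distinguished(cycle6, path6))          # True
```

Both the 6-cycle and the pair of triangles are 2-regular, so every vertex sees the same multiset of neighbour colours in every round and the colouring never splits. The path is told apart in the first round because its two endpoints have degree 1.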

Complexity

The stable colouring is computable in $O((n+m)\log n)$ time, where $n$ is the number of vertices and $m$ the number of edges.^[5] This complexity has been proven to be optimal under reasonable assumptions.^[6]
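For comparison, a short count bounds the number of rounds of the plain round-by-round procedure from the Description section. The symbols $k_{i}$ (the number of colour classes of $\lambda _{i}$) and $t$ (the first round at which the partition stops changing) are introduced here only for this sketch. Every round before stabilisation splits at least one colour class, and there can never be more than $n$ classes, so

$$1\leq k_{0}<k_{1}<\dots <k_{t}\leq n\quad \Longrightarrow \quad t\leq n-1.$$

A naive implementation that recomputes the colour of every vertex in each round may therefore need up to $n$ rounds, each of which looks at all $n$ vertices and $m$ edges. This is why a bound of $O((n+m)\log n)$ for the whole computation requires a more careful refinement strategy than simply repeating the definition.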

References

[1] Grohe, Martin; Kersting, Kristian; Mladenov, Martin; Schweitzer, Pascal (2021). "Color Refinement and Its Applications". An Introduction to Lifted Probabilistic Inference. doi:10.7551/mitpress/10548.003.0023. ISBN 9780262365598. S2CID 59069015.
[2] Grohe, Martin (2021-06-29). "The Logic of Graph Neural Networks". 2021 36th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS). LICS '21. New York, NY, USA: Association for Computing Machinery. pp. 1–17. arXiv:2104.14624. doi:10.1109/LICS52264.2021.9470677. ISBN 978-1-6654-4895-6. S2CID 233476550.
[3] Babai, László; Erdős, Paul; Selkow, Stanley M. (August 1980). "Random Graph Isomorphism". SIAM Journal on Computing. 9 (3): 628–635. doi:10.1137/0209047. ISSN 0097-5397.
[4] "Canonical labelling of graphs in linear average time". IEEE Conference Publication. IEEE Xplore.
[5] Cardon, A.; Crochemore, M. (1982-07-01). "Partitioning a graph in O(|A| log2 |V|)". Theoretical Computer Science. 19 (1): 85–98. doi:10.1016/0304-3975(82)90016-0. ISSN 0304-3975.
[6] Berkholz, Christoph; Bonsma, Paul; Grohe, Martin (2017-05-01). "Tight Lower and Upper Bounds for the Complexity of Canonical Colour Refinement". Theory of Computing Systems. 60 (4): 581–614. arXiv:1509.08251. doi:10.1007/s00224-016-9686-0. ISSN 1433-0490. S2CID 12616856.
