Permutation matrix
In mathematics, particularly in matrix theory, a permutation matrix is a square binary matrix that has exactly one entry of 1 in each row and each column with all other entries 0. Such a matrix P, say of size n × n, can represent a permutation of n elements. Pre-multiplying an n-row matrix M by such a permutation matrix, forming PM, results in permuting the rows of M, while post-multiplying an n-column matrix M, forming MP, permutes the columns of M.
Every permutation matrix P is orthogonal, with its inverse equal to its transpose: P⁻¹ = Pᵀ (proved below). Indeed, permutation matrices can be characterized as the orthogonal matrices whose entries are all non-negative.[1] (Thinking geometrically, the only way to fit n orthogonal unit vectors into the nonnegative orthant of Euclidean n-space is to have each vector point forward along one of the coordinate axes, which can be done in n! ways.)
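These defining properties are easy to check numerically. The following is a minimal NumPy sketch (the permutation chosen is arbitrary, written 0-indexed as is usual in code):

```python
import numpy as np

# An arbitrary permutation of 4 elements, 0-indexed: slot i maps to slot pi[i].
pi = [2, 1, 3, 0]
P = np.zeros((4, 4), dtype=int)
for i, j in enumerate(pi):
    P[i, j] = 1   # row i has its single 1 in column pi[i]

# Exactly one entry of 1 in each row and each column, all other entries 0.
assert (P.sum(axis=0) == 1).all() and (P.sum(axis=1) == 1).all()
assert ((P == 0) | (P == 1)).all()

# P is orthogonal: its transpose is its inverse.
assert (P @ P.T == np.eye(4, dtype=int)).all()
```

The orthogonality check here anticipates the proof given below.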
The two permutation/matrix correspondences
There are two different one-to-one correspondences between permutations and permutation matrices, one of which works along the rows of the matrix, the other along its columns. Here is an example, starting with a permutation π of four elements, written in two-line form:

  π = ( 1 2 3 4
        3 2 4 1 )

that is, π(1) = 3, π(2) = 2, π(3) = 4, and π(4) = 1.

The row-based correspondence takes the permutation π to the matrix

  Rπ = ( 0 0 1 0
         0 1 0 0
         0 0 0 1
         1 0 0 0 )

The first row of Rπ has its 1 in the third column because π(1) = 3. More generally, Rπ = (rᵢⱼ) where rᵢⱼ = 1 when j = π(i) and rᵢⱼ = 0 otherwise.

The column-based correspondence takes π to the matrix

  Cπ = ( 0 0 0 1
         0 1 0 0
         1 0 0 0
         0 0 1 0 )

The first column of Cπ has its 1 in the third row because π(1) = 3. More generally, Cπ = (cᵢⱼ) where cᵢⱼ is 1 when i = π(j) and 0 otherwise. Since the two recipes differ only by swapping i with j, the matrix Cπ is the transpose of Rπ; and, since Rπ is a permutation matrix, we have Cπ = Rπᵀ = (Rπ)⁻¹. Passing to the inverse permutation swaps the two correspondences: the row-based matrix of π⁻¹ is Cπ, and the column-based matrix of π⁻¹ is Rπ.[2]
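The two correspondences and their transpose relationship can be checked in code with 0-indexed permutations (a NumPy sketch; the helper names `row_matrix`, `col_matrix`, and `inverse` are illustrative, not standard library functions):

```python
import numpy as np

def row_matrix(pi):
    """Row-based matrix: row i has its 1 in column pi[i]."""
    n = len(pi)
    R = np.zeros((n, n), dtype=int)
    for i in range(n):
        R[i, pi[i]] = 1
    return R

def col_matrix(pi):
    """Column-based matrix: column j has its 1 in row pi[j]."""
    n = len(pi)
    C = np.zeros((n, n), dtype=int)
    for j in range(n):
        C[pi[j], j] = 1
    return C

def inverse(pi):
    """Inverse permutation: if pi sends i to j, the inverse sends j to i."""
    inv = [0] * len(pi)
    for i, j in enumerate(pi):
        inv[j] = i
    return inv

pi = [2, 1, 3, 0]   # 0-indexed version of the example permutation above
assert (col_matrix(pi) == row_matrix(pi).T).all()           # C is the transpose of R
assert (row_matrix(inverse(pi)) == col_matrix(pi)).all()    # row-based matrix of the inverse is C
assert (col_matrix(inverse(pi)) == row_matrix(pi)).all()    # column-based matrix of the inverse is R
```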
These matrices permute rows or columns
Multiplying a matrix M by either Rπ or Cπ on either the left or the right will permute either the rows or columns of M by either π or π⁻¹. The details are a bit tricky.
To begin with, when we permute the entries of a vector (v₁, …, vₙ) by some permutation π, we move the entry vᵢ of the input vector into the slot π(i) of the output vector. Which entry then ends up in, say, the first slot of the output? Answer: the entry vᵢ for which π(i) = 1, and hence i = π⁻¹(1). Arguing similarly about each of the slots, we find that the output vector is

  (vπ⁻¹(1), vπ⁻¹(2), …, vπ⁻¹(n)),

even though we are permuting by π, not by π⁻¹. More generally, permuting by π the rows of an n-row matrix M produces the matrix whose (i, j) entry is the (π⁻¹(i), j) entry of M. And permuting by π the columns of an n-column matrix M produces the matrix whose (i, j) entry is the (i, π⁻¹(j)) entry of M. (By the way, permuting the entries among the fixed slots in this way is using the alibi viewpoint. If we had instead permuted the labels of the slots while leaving the entries fixed, we would have been using the alias viewpoint, where those two viewpoints differ by yet another inversion of π.[3])
With that in mind, we can show that pre-multiplying an n-row matrix M by Cπ permutes the rows of M by π. By the rule for matrix multiplication, the (i, j) entry of the product Cπ M is given by

  Σₖ cᵢₖ mₖⱼ,

where cᵢₖ = 1 when i = π(k), that is, when k = π⁻¹(i), and cᵢₖ = 0 otherwise. Since the only summand that survives is the one with k = π⁻¹(i), the sum simplifies to the (π⁻¹(i), j) entry of M; and that establishes our claim. Symmetrically, post-multiplying an n-column matrix M by Rπ permutes the columns of M by π, since the (i, j) entry of the product M Rπ is Σₖ mᵢₖ rₖⱼ, which reduces to the (i, π⁻¹(j)) entry of M.
The other two options are pre-multiplying by Rπ or post-multiplying by Cπ, and they permute the rows or columns respectively by π⁻¹, instead of by π.
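The row- and column-permuting rules can be verified numerically. Here is a self-contained NumPy sketch using a 0-indexed permutation, with the matrices built inline:

```python
import numpy as np

pi = [2, 1, 3, 0]   # 0-indexed permutation: slot i of the input goes to slot pi[i]
n = len(pi)
R = np.zeros((n, n), dtype=int)   # row-based matrix: row i has its 1 in column pi[i]
C = np.zeros((n, n), dtype=int)   # column-based matrix: column j has its 1 in row pi[j]
for i in range(n):
    R[i, pi[i]] = 1
    C[pi[i], i] = 1

M = np.arange(n * n).reshape(n, n)   # an arbitrary test matrix

# Permuting the rows of M by pi: row i of the input moves to row pi[i] of the output.
rows_permuted = np.empty_like(M)
for i in range(n):
    rows_permuted[pi[i], :] = M[i, :]
assert (C @ M == rows_permuted).all()    # pre-multiplying by C permutes rows by pi

# Permuting the columns of M by pi: column j of the input moves to column pi[j].
cols_permuted = np.empty_like(M)
for j in range(n):
    cols_permuted[:, pi[j]] = M[:, j]
assert (M @ R == cols_permuted).all()    # post-multiplying by R permutes columns by pi
```

The other two products, R @ M and M @ C, permute by the inverse permutation instead.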
The transpose is also the inverse
A related argument shows that, as we claimed at the outset, the transpose of any permutation matrix P also acts as its inverse, thus implying that P is invertible. If P = (pᵢⱼ), then the (i, j) entry of its transpose Pᵀ is pⱼᵢ. The (i, j) entry of the product P Pᵀ is then

  Σₖ pᵢₖ pⱼₖ.

Whenever i ≠ j, the k-th term in this sum is the product of two different entries in the k-th column of P; so all terms are 0, and the sum is 0. When i = j, we are summing the squares of the entries in the i-th row of P, so we get 1. The product P Pᵀ is thus the identity matrix. A symmetric argument shows the same for Pᵀ P, implying that P is invertible with P⁻¹ = Pᵀ.
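The entrywise argument can be replayed numerically (a small NumPy sketch; the permutation is arbitrary):

```python
import numpy as np

pi = [2, 0, 3, 1]   # an arbitrary 0-indexed permutation
n = len(pi)
P = np.zeros((n, n), dtype=int)
P[range(n), pi] = 1

# The (i, j) entry of P P^T is the dot product of rows i and j of P:
# 0 when i != j (the single 1s of distinct rows sit in different columns),
# and 1 when i = j (the sum of squares of one row's entries).
for i in range(n):
    for j in range(n):
        entry = sum(P[i, k] * P[j, k] for k in range(n))
        assert entry == (1 if i == j else 0)

# Hence P^T is a two-sided inverse of P.
assert (P @ P.T == np.eye(n, dtype=int)).all()
assert (P.T @ P == np.eye(n, dtype=int)).all()
```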
Multiplying permutation matrices
Given two permutations of n elements, 𝜎 and 𝜏, the product of the corresponding column-based permutation matrices Cσ and Cτ is given, as you might expect, by

  Cσ Cτ = Cσ∘τ,

where the composed permutation 𝜎∘𝜏 applies first 𝜏 and then 𝜎, working from right to left: (𝜎∘𝜏)(k) = 𝜎(𝜏(k)). This follows because pre-multiplying some matrix by Cτ and then pre-multiplying the resulting product by Cσ gives the same result as pre-multiplying just once by the combined Cσ∘τ.
For the row-based matrices, there is a twist: the product of Rσ and Rτ is given by

  Rσ Rτ = Rτ∘σ,

with 𝜎 applied before 𝜏 in the composed permutation. This happens because we must post-multiply to avoid inversions under the row-based option, so we would post-multiply first by Rσ and then by Rτ.
Some people, when applying a function to an argument, write the function after the argument, rather than before it. When doing linear algebra, they work with linear spaces of row vectors, and they apply a linear map to an argument by using the map's matrix to post-multiply the argument's row vector. Often, they also use a left-to-right composition operator, which we here denote using a semicolon; so the composition 𝜎;𝜏 is defined either by

  (𝜎;𝜏)(k) = 𝜏(𝜎(k))

or, more elegantly, by

  𝜎;𝜏 = 𝜏∘𝜎,

with 𝜎 applied first. That notation gives us a simpler rule for multiplying row-based permutation matrices:

  Rσ Rτ = Rσ;τ.
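Both composition rules can be confirmed in a few lines of NumPy, using 0-indexed permutations (the helpers `col_matrix`, `row_matrix`, and `compose` are illustrative names):

```python
import numpy as np

def col_matrix(p):
    """Column-based matrix: column j has its 1 in row p[j]."""
    n = len(p)
    C = np.zeros((n, n), dtype=int)
    for j in range(n):
        C[p[j], j] = 1
    return C

def row_matrix(p):
    """Row-based matrix, the transpose of the column-based one."""
    return col_matrix(p).T

def compose(s, t):
    """Right-to-left composition: apply t first, then s."""
    return [s[t[k]] for k in range(len(t))]

sigma = [1, 2, 0, 3]
tau   = [2, 0, 3, 1]

# Column-based matrices compose right to left ...
assert (col_matrix(sigma) @ col_matrix(tau) == col_matrix(compose(sigma, tau))).all()
# ... while row-based matrices compose left to right (sigma applied first).
assert (row_matrix(sigma) @ row_matrix(tau) == row_matrix(compose(tau, sigma))).all()
```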
Matrix group
When π is the identity permutation, which has π(i) = i for all i, both Cπ and Rπ are the identity matrix.
The map C is a one-to-one correspondence between permutations and permutation matrices, and R is another such correspondence; since there are n! permutations, there are also n! permutation matrices. By the formulas above, those n × n permutation matrices form a group of order n! under matrix multiplication, with the identity matrix as its identity element; we denote that group 𝒫ₙ. The group 𝒫ₙ is a subgroup of the general linear group GLₙ(𝐑) of invertible n × n matrices of real numbers. Indeed, for any field F, the group 𝒫ₙ is also a subgroup of the group GLₙ(F) of invertible n × n matrices whose entries belong to F.
Let Sₙ denote the symmetric group, or group of permutations, on {1, 2, ..., n}, where the group operation is the standard, right-to-left composition "∘"; and let Sₙᵒᵖ denote the opposite group, which uses the left-to-right composition ";". The map π ↦ Cπ that takes each permutation to its column-based matrix is a faithful representation of Sₙ, and similarly the map π ↦ Rπ is a faithful representation of Sₙᵒᵖ.
Doubly stochastic matrices
Every permutation matrix is doubly stochastic: all its entries are nonnegative, and every row and every column sums to 1. The set of all doubly stochastic matrices of a given order is called the Birkhoff polytope, and the permutation matrices play a special role in that polytope. The Birkhoff–von Neumann theorem says that every doubly stochastic real matrix is a convex combination of permutation matrices of the same order, with the permutation matrices being precisely the extreme points (the vertices) of the Birkhoff polytope. The Birkhoff polytope is thus the convex hull of the set of permutation matrices.[4]
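The easy direction of the theorem, that a convex combination of permutation matrices is doubly stochastic, can be checked directly (a NumPy sketch with arbitrarily chosen permutations and weights):

```python
import numpy as np

# Three 0-indexed permutations of {0, 1, 2} and convex weights (nonnegative, summing to 1).
perms = [[0, 1, 2], [1, 2, 0], [2, 0, 1]]
weights = [0.5, 0.3, 0.2]

D = np.zeros((3, 3))
for w, p in zip(weights, perms):
    P = np.zeros((3, 3))
    P[range(3), p] = 1   # permutation matrix for p
    D += w * P

# D is doubly stochastic: nonnegative, with every row and column summing to 1.
assert (D >= 0).all()
assert np.allclose(D.sum(axis=0), 1) and np.allclose(D.sum(axis=1), 1)
```

The converse direction, decomposing an arbitrary doubly stochastic matrix into such a combination, is the substantive content of the Birkhoff–von Neumann theorem.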
Linear-algebraic properties
Every permutation matrix P is Rπ for a unique permutation π and is also Cσ for a unique permutation σ, where σ = π⁻¹. We can compute the linear-algebraic properties of P from some combinatorial properties that are shared by π and π⁻¹.
A point i is fixed by π just when it is fixed by π⁻¹, and the trace of P is the number of such fixed points. If the integer k is fixed by π, then the standard basis vector eₖ is an eigenvector of P, with eigenvalue 1.
To calculate the eigenvalues of P, write the permutation π as a product of cycles, say, π = c₁c₂⋯cₜ. For 1 ≤ i ≤ t, let the length of the cycle cᵢ be ℓᵢ, and let Lᵢ be the set of complex solutions of x^ℓᵢ = 1, those solutions being the ℓᵢ-th roots of unity. The union of the Lᵢ is then the set of eigenvalues of P. Since writing π⁻¹ as a product of cycles would give the same number of cycles of the same lengths, analyzing π⁻¹ would have given us the same result. The geometric multiplicity of any eigenvalue v is the number of i for which Lᵢ contains v.[5]
From group theory we know that any permutation may be written as a product of transpositions. Therefore, any permutation matrix factors as a product of row-switching elementary matrices, each of which has determinant −1. Thus, the determinant of the permutation matrix P is the sign of the permutation π, which is also the sign of π⁻¹.
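These spectral facts can be illustrated numerically. The sketch below uses a permutation consisting of one 2-cycle and one 3-cycle (an arbitrary choice), so its eigenvalues should be the square roots of unity together with the cube roots of unity:

```python
import numpy as np

pi = [1, 0, 3, 4, 2]   # 0-indexed: the 2-cycle (0 1) and the 3-cycle (2 3 4)
n = len(pi)
P = np.zeros((n, n))
P[range(n), pi] = 1    # row-based permutation matrix

# Cycle lengths 2 and 3 give eigenvalues: the 2nd and 3rd roots of unity.
expected = np.sort_complex(np.concatenate([
    np.exp(2j * np.pi * np.arange(2) / 2),   # {1, -1}
    np.exp(2j * np.pi * np.arange(3) / 3),   # cube roots of unity
]))
actual = np.sort_complex(np.linalg.eigvals(P))
assert np.allclose(expected, actual)

# The determinant is the sign of the permutation: a 2-cycle is odd and a
# 3-cycle is even, so the sign here is -1.
assert np.isclose(np.linalg.det(P), -1)

# The trace counts fixed points: this permutation has none.
assert P.trace() == 0
```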
Restricted forms
- Costas array, a permutation matrix in which the displacement vectors between the entries are all distinct
- n-queens puzzle, a permutation matrix in which there is at most one entry in each diagonal and antidiagonal
See also
References
- ^ Zavlanos, Michael M.; Pappas, George J. (November 2008). "A dynamical systems approach to weighted graph matching". Automatica. 44 (11): 2817–2824. doi:10.1016/j.automatica.2008.04.009. S2CID 834305. Retrieved 21 August 2022.
In particular, since permutation matrices are orthogonal matrices with nonnegative elements, we define two gradient flows in the space of orthogonal matrices... Lemma 5: Let 𝒪 denote the set of orthogonal matrices and 𝒩 denote the set of element-wise non-negative matrices. Then 𝒪 ∩ 𝒩 = 𝒫, where 𝒫 is the set of permutation matrices.
- ^ Terminology is not standard. Most authors use just one of these correspondences, choosing which to be consistent with their other notation, so there is typically no need for two names.
- ^ Conway, John H.; Burgiel, Heidi; Goodman-Strauss, Chaim (2008). The Symmetries of Things. A K Peters. p. 179.
A permutation---say, of the names of a number of people---can be thought of as moving either the names or the people. The alias viewpoint regards the permutation as assigning a new name or alias to each person (from the Latin alias = otherwise). Alternatively, from the alibi viewpoint we move the people to the places corresponding to their new names (from the Latin alibi = in another place).
- ^ Brualdi (2006), p. 19
- ^ Najnudel & Nikeghbali (2010), p. 4
- Brualdi, Richard A. (2006). Combinatorial matrix classes. Encyclopedia of Mathematics and Its Applications. Vol. 108. Cambridge: Cambridge University Press. ISBN 0-521-86565-4. Zbl 1106.05001.
- Najnudel, Joseph; Nikeghbali, Ashkan (2010), The Distribution of Eigenvalues of Randomized Permutation Matrices, arXiv:1005.0402, Bibcode:2010arXiv1005.0402N