Commutation matrix

In mathematics, especially in linear algebra and matrix theory, the commutation matrix is used for transforming the vectorized form of a matrix into the vectorized form of its transpose. Specifically, the commutation matrix K^(m,n) is the nm × mn matrix which, for any m × n matrix A, transforms vec(A) into vec(A^T):

K^(m,n) vec(A) = vec(A^T).

Here vec(A) is the mn × 1 column vector obtained by stacking the columns of A on top of one another:

\operatorname {vec} (\mathbf {A} )=[\mathbf {A} _{1,1},\ldots ,\mathbf {A} _{m,1},\mathbf {A} _{1,2},\ldots ,\mathbf {A} _{m,2},\ldots ,\mathbf {A} _{1,n},\ldots ,\mathbf {A} _{m,n}]^{\mathrm {T} }

where A = [A_i,j].
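As a minimal sketch of this definition in NumPy (the helper name `vec` and the sample matrix are illustrative choices, not part of any library): flattening in column-major order gives vec(A), and flattening the transpose the same way gives vec(A^T), which holds the same entries in a permuted order.

```python
import numpy as np

def vec(A):
    # Stack the columns of A on top of one another (column-major order).
    return np.reshape(A, -1, order='F')

# Illustrative 2 × 3 matrix (m = 2, n = 3).
A = np.array([[1, 2, 3],
              [4, 5, 6]])

vec_A = vec(A)      # columns of A stacked
vec_AT = vec(A.T)   # columns of A^T, i.e. rows of A, stacked

print(vec_A)
print(vec_AT)
```

The commutation matrix K^(2,3) is the permutation matrix that maps the first of these vectors to the second.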

In the context of quantum information theory, the commutation matrix is sometimes referred to as the swap matrix or swap operator.^[1]

Properties

The commutation matrix is a special type of permutation matrix, and is therefore orthogonal.

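Replacing A with A^T in the definition of the commutation matrix gives K^(n,m) vec(A^T) = vec(A), so K^(n,m) K^(m,n) is the identity matrix. Since K^(n,m) is orthogonal, its inverse is its transpose, and therefore K^(m,n) = (K^(n,m))^T. In the special case m = n the commutation matrix is therefore symmetric and an involution.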
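These properties can be checked numerically. The sketch below assumes a small NumPy helper, here called `comm_mat`, that builds K^(m,n) as a row permutation of the identity; the dimensions chosen are arbitrary.

```python
import numpy as np

def comm_mat(m, n):
    # Row k of K^(m,n) is row w[k] of the identity, where w[k] is the
    # position in vec(A) of the k-th entry of vec(A^T), for an m × n matrix A.
    w = np.arange(m * n).reshape((m, n), order='F').T.ravel(order='F')
    return np.eye(m * n, dtype=int)[w, :]

m, n = 3, 4
K_mn = comm_mat(m, n)
K_nm = comm_mat(n, m)
I = np.eye(m * n, dtype=int)

is_orthogonal = np.array_equal(K_mn.T @ K_mn, I)   # K^T K = I
is_transpose = np.array_equal(K_mn, K_nm.T)        # K^(m,n) = (K^(n,m))^T

# Square case m = n: symmetric and an involution.
K_sq = comm_mat(3, 3)
is_symmetric = np.array_equal(K_sq, K_sq.T)
is_involution = np.array_equal(K_sq @ K_sq, np.eye(9, dtype=int))

print(is_orthogonal, is_transpose, is_symmetric, is_involution)
```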

The main use of the commutation matrix, and the source of its name, is to commute the Kronecker product: for every m × n matrix A and every r × q matrix B,

\mathbf {K} ^{(r,m)}(\mathbf {A} \otimes \mathbf {B} )\mathbf {K} ^{(n,q)}=\mathbf {B} \otimes \mathbf {A} .

This property is often used in developing the higher order statistics of Wishart covariance matrices.^[2]

Taking n = q = 1 in the above equation shows that, for any column vectors v and w of sizes m and r respectively,

\mathbf {K} ^{(r,m)}(\mathbf {v} \otimes \mathbf {w} )=\mathbf {w} \otimes \mathbf {v} .

This property is the reason that this matrix is referred to as the "swap operator" in the context of quantum information theory.
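Both the general Kronecker identity and its vector special case can be verified on random integer data. This is only a sketch: the helper `comm_mat`, the dimensions, and the random seed are assumptions made for illustration.

```python
import numpy as np

def comm_mat(m, n):
    # K^(m,n) as a row permutation of the identity, so that
    # comm_mat(m, n) @ vec(A) == vec(A.T) for an m × n matrix A.
    w = np.arange(m * n).reshape((m, n), order='F').T.ravel(order='F')
    return np.eye(m * n, dtype=int)[w, :]

rng = np.random.default_rng(0)
m, n, r, q = 2, 3, 4, 5
A = rng.integers(-5, 5, size=(m, n))   # m × n
B = rng.integers(-5, 5, size=(r, q))   # r × q

# K^(r,m) (A ⊗ B) K^(n,q) = B ⊗ A
lhs = comm_mat(r, m) @ np.kron(A, B) @ comm_mat(n, q)
kron_commutes = np.array_equal(lhs, np.kron(B, A))

# Vector case: K^(r,m) (v ⊗ w) = w ⊗ v
v = rng.integers(-5, 5, size=m)
w = rng.integers(-5, 5, size=r)
swaps = np.array_equal(comm_mat(r, m) @ np.kron(v, w), np.kron(w, v))

print(kron_commutes, swaps)
```

Integer entries are used so that the comparison is exact rather than subject to floating-point tolerance.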

An explicit form for the commutation matrix is as follows: if e_r,j denotes the j-th canonical vector of dimension r (i.e. the vector with 1 in the j-th coordinate and 0 elsewhere) then

\mathbf {K} ^{(r,m)}=\sum _{i=1}^{r}\sum _{j=1}^{m}\left(\mathbf {e} _{r,i}{\mathbf {e} _{m,j}}^{\mathrm {T} }\right)\otimes \left(\mathbf {e} _{m,j}{\mathbf {e} _{r,i}}^{\mathrm {T} }\right).
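The explicit sum can be transcribed almost literally into NumPy. In this sketch the names `unit_vec` and `comm_mat_sum` are illustrative, indices are 0-based, and the result is checked against the defining property K^(r,m) vec(A) = vec(A^T) for an r × m matrix A.

```python
import numpy as np

def unit_vec(dim, j):
    # j-th canonical basis vector of the given dimension (0-based j).
    e = np.zeros(dim, dtype=int)
    e[j] = 1
    return e

def comm_mat_sum(r, m):
    # K^(r,m) = sum_i sum_j (e_{r,i} e_{m,j}^T) ⊗ (e_{m,j} e_{r,i}^T)
    K = np.zeros((r * m, r * m), dtype=int)
    for i in range(r):
        for j in range(m):
            E = np.outer(unit_vec(r, i), unit_vec(m, j))  # r × m, single 1 at (i, j)
            K += np.kron(E, E.T)
    return K

r, m = 3, 2
A = np.arange(1, r * m + 1).reshape((r, m), order='F')  # r × m test matrix
K = comm_mat_sum(r, m)
matches_definition = np.array_equal(K @ A.ravel(order='F'),
                                    A.T.ravel(order='F'))
print(matches_definition)
```

The double loop costs far more than a direct permutation of the identity, so this form is mainly useful for checking the formula rather than for building large matrices.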

The commutation matrix may be expressed as the following block matrix:

\mathbf {K} ^{(m,n)}={\begin{bmatrix}\mathbf {K} _{1,1}&\cdots &\mathbf {K} _{1,n}\\\vdots &\ddots &\vdots \\\mathbf {K} _{m,1}&\cdots &\mathbf {K} _{m,n}\end{bmatrix}},

where the (p,q) entry of the n × m block K_i,j, for 1 ≤ p ≤ n and 1 ≤ q ≤ m, is given by

\mathbf {K} _{i,j}(p,q)={\begin{cases}1&i=q{\text{ and }}j=p\\0&{\text{otherwise}}.\end{cases}}

For example,

\mathbf {K} ^{(3,4)}=\left[{\begin{array}{ccc|ccc|ccc|ccc}1&0&0&0&0&0&0&0&0&0&0&0\\0&0&0&1&0&0&0&0&0&0&0&0\\0&0&0&0&0&0&1&0&0&0&0&0\\0&0&0&0&0&0&0&0&0&1&0&0\\\hline 0&1&0&0&0&0&0&0&0&0&0&0\\0&0&0&0&1&0&0&0&0&0&0&0\\0&0&0&0&0&0&0&1&0&0&0&0\\0&0&0&0&0&0&0&0&0&0&1&0\\\hline 0&0&1&0&0&0&0&0&0&0&0&0\\0&0&0&0&0&1&0&0&0&0&0&0\\0&0&0&0&0&0&0&0&1&0&0&0\\0&0&0&0&0&0&0&0&0&0&0&1\end{array}}\right].
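The block description above can be sketched with `np.block`. The function name `comm_mat_blocks` is an illustrative choice and the indices are 0-based, so block (i, j) carries its single 1 at local position (j, i).

```python
import numpy as np

def comm_mat_blocks(m, n):
    # m × n grid of n × m blocks; block (i, j) is zero except for
    # a single 1 at local position (p, q) = (j, i).
    blocks = [[np.zeros((n, m), dtype=int) for _ in range(n)]
              for _ in range(m)]
    for i in range(m):
        for j in range(n):
            blocks[i][j][j, i] = 1
    return np.block(blocks)

K34 = comm_mat_blocks(3, 4)

# Check the defining property on a 3 × 4 test matrix.
A = np.arange(12).reshape((3, 4))
ok = np.array_equal(K34 @ A.ravel(order='F'), A.T.ravel(order='F'))
print(K34.shape, ok)
```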

Code

The commutation matrix K^(m,n) for matrices with m rows and n columns, whether square or rectangular, can be generated by the code below.

Python

import numpy as np

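def comm_mat(m, n):
    # label the entries of vec(A) by 0..mn-1 and arrange them as the m × n matrix A
    v = np.arange(m*n)
    A = np.reshape(v, (m, n), order='F')
    # w = vec(A^T): w[k] is the position in vec(A) of the k-th entry of vec(A^T)
    w = np.reshape(A.T, m*n, order='F')

    # take the rows of the identity in the order given by w,
    # so that M @ vec(A) == vec(A^T)
    M = np.eye(m*n, dtype='int')
    M = M[w, :]
    return M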

Alternatively, a version without imports:

# Kronecker delta
def delta(i, j):
    return int(i == j)

def comm_mat(m, n):
    # v[k] is the position in vec(A) of the k-th entry of vec(A^T),
    # for an m × n matrix A (0-based indices)
    v = [m*j + i for i in range(m) for j in range(n)]

    # start from the identity and take its rows in the order given by v,
    # so that row k of K has its single 1 in column v[k]
    M = [[delta(i, j) for j in range(m*n)] for i in range(m*n)]
    M = [M[i] for i in v]
    return M

Example

Let M denote the following 2×2 square matrix:

M={\begin{bmatrix}a&b\\c&d\\\end{bmatrix}}

Then we have

\operatorname {vec} (M)={\begin{bmatrix}a\\c\\b\\d\\\end{bmatrix}}

and K^(2,2) is the 4×4 square matrix that transforms vec(M) into vec(M^T):

{\begin{bmatrix}1&0&0&0\\0&0&1&0\\0&1&0&0\\0&0&0&1\\\end{bmatrix}}{\begin{bmatrix}a\\c\\b\\d\\\end{bmatrix}}={\begin{bmatrix}a\\b\\c\\d\\\end{bmatrix}}=\operatorname {vec} (M^{\mathrm {T} })

The 3×2 matrix A below can be vectorized in two ways, column by column as vec(A) or row by row as vec(A^T):

A={\begin{bmatrix}1&4\\2&5\\3&6\\\end{bmatrix}},\quad V_{1}=\operatorname {vec} (A)={\begin{bmatrix}1\\2\\3\\4\\5\\6\\\end{bmatrix}},\quad V_{2}=\operatorname {vec} (A^{\mathrm {T} })={\begin{bmatrix}1\\4\\2\\5\\3\\6\\\end{bmatrix}}

and the code above, called as comm_mat(3,2), yields

K=\mathbf {K} ^{(3,2)}={\begin{bmatrix}1&\cdot &\cdot &\cdot &\cdot &\cdot \\\cdot &\cdot &\cdot &1&\cdot &\cdot \\\cdot &1&\cdot &\cdot &\cdot &\cdot \\\cdot &\cdot &\cdot &\cdot &1&\cdot \\\cdot &\cdot &1&\cdot &\cdot &\cdot \\\cdot &\cdot &\cdot &\cdot &\cdot &1\\\end{bmatrix}}

giving the expected results

K^{\mathrm {T} }K=KK^{\mathrm {T} }=\mathbf {I} _{6}

KV_{1}=V_{2}

K^{\mathrm {T} }V_{2}=V_{1}

References

[1] Watrous, John (2018). The Theory of Quantum Information. Cambridge University Press. p. 94.
[2] von Rosen, Dietrich (1988). "Moments for the Inverted Wishart Distribution". Scand. J. Stat. 15: 97–109.

Jan R. Magnus and Heinz Neudecker (1988), Matrix Differential Calculus with Applications in Statistics and Econometrics, Wiley.

