
User:Fawly/Computational complexity of matrix multiplication


Because matrix multiplication is such a central operation in many numerical algorithms, much work has been invested in making matrix multiplication algorithms efficient. Applications of matrix multiplication in computational problems are found in many fields including scientific computing and pattern recognition and in seemingly unrelated problems such as counting the paths through a graph.[1] Many different algorithms have been designed for multiplying matrices on different types of hardware, including parallel and distributed systems, where the computational work is spread over multiple processors (perhaps over a network).

Directly applying the mathematical definition of matrix multiplication gives an algorithm that takes time on the order of n^3 field operations to multiply two n × n matrices over that field (Θ(n^3) in big O notation). Better asymptotic bounds on the time required to multiply matrices have been known since Strassen's algorithm in the 1960s, but it is still unknown what the optimal time is (i.e., what the complexity of the problem is). As of December 2020, the matrix multiplication algorithm with the best asymptotic complexity runs in O(n^2.3728596) time, given by Josh Alman and Virginia Vassilevska Williams;[2][3] however, this algorithm is a galactic algorithm because of the large constants and cannot be realized practically.

Algorithms exist that provide better running times than the straightforward ones. The first to be discovered was Strassen's algorithm, devised by Volker Strassen in 1969 and often referred to as "fast matrix multiplication". It is based on a way of multiplying two 2 × 2 matrices which requires only 7 multiplications (instead of the usual 8), at the expense of several additional addition and subtraction operations. Applying this recursively gives an algorithm with a multiplicative cost of O(n^(log2 7)) ≈ O(n^2.807). Strassen's algorithm is more complex, and the numerical stability is reduced compared to the naïve algorithm,[4] but it is faster in cases where n > 100 or so[1] and appears in several libraries, such as BLAS.[5] It is very useful for large matrices over exact domains such as finite fields, where numerical stability is not an issue.
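The seven block products and their recombination can be written out explicitly. The following is a minimal sketch of the recursion in Python (assuming NumPy and a dimension that is a power of two; the function name strassen and the omission of padding and of a cutoff to the naïve method are illustrative choices, not part of any standard library):

```python
import numpy as np

def strassen(A, B):
    """Sketch of Strassen's recursion for n x n matrices, n a power of two."""
    n = A.shape[0]
    if n == 1:
        return A * B  # a single scalar multiplication
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    # Seven recursive multiplications instead of the usual eight
    M1 = strassen(A11 + A22, B11 + B22)
    M2 = strassen(A21 + A22, B11)
    M3 = strassen(A11, B12 - B22)
    M4 = strassen(A22, B21 - B11)
    M5 = strassen(A11 + A12, B22)
    M6 = strassen(A21 - A11, B11 + B12)
    M7 = strassen(A12 - A22, B21 + B22)
    # Recombination uses only additions and subtractions
    C11 = M1 + M4 - M5 + M7
    C12 = M3 + M5
    C21 = M2 + M4
    C22 = M1 - M2 + M3 + M6
    return np.block([[C11, C12], [C21, C22]])
```

The running time satisfies T(n) = 7T(n/2) + O(n^2), which resolves to the O(n^(log2 7)) bound quoted above; practical implementations switch to the naïve method below some cutoff size instead of recursing down to 1 × 1 blocks.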

The matrix multiplication algorithm that results from the definition requires, in the worst case, n^3 multiplications of scalars and (n − 1)n^2 additions for computing the product of two square n×n matrices. Its computational complexity is therefore O(n^3), in a model of computation for which the scalar operations require constant time (in practice, this is the case for floating point numbers, but not necessarily for integers).
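In code, a direct transcription of the definition is a routine with three nested loops (a hypothetical sketch using plain Python lists, not an optimized kernel), which makes the operation count explicit:

```python
def naive_multiply(A, B):
    """Schoolbook product of two n x n matrices given as lists of lists:
    each of the n^2 entries costs n multiplications and n - 1 additions."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = A[i][0] * B[0][j]
            for k in range(1, n):
                s += A[i][k] * B[k][j]
            C[i][j] = s
    return C
```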

Rather surprisingly, this complexity is not optimal, as shown in 1969 by Volker Strassen, who provided an algorithm, now called Strassen's algorithm, with a complexity of O(n^(log2 7)) ≈ O(n^2.8074).[6] The exponent appearing in the complexity of matrix multiplication has been improved several times,[7][8][9][10][11][12] leading to the Coppersmith–Winograd algorithm with a complexity of O(n^2.3755) (1990).[13][14] This algorithm was slightly improved in 2010 by Stothers to a complexity of O(n^2.3737),[15] in 2012 by Virginia Vassilevska Williams to O(n^2.3729),[14][16] and in 2014 by François Le Gall to O(n^2.3728639).[17] This was further refined in 2020 by Josh Alman and Virginia Vassilevska Williams to the best bound known to date, O(n^2.3728596).[18]

The greatest lower bound for the exponent of matrix multiplication algorithms is generally called ω. It is trivially true that ω ≥ 2, because one must read the n^2 elements of a matrix in order to multiply it with another matrix. Thus 2 ≤ ω ≤ 2.3728596. It is unknown whether ω = 2. The largest known lower bound for matrix-multiplication complexity is Ω(n^2 log(n)), for a restricted kind of arithmetic circuits, and is due to Ran Raz.[19]

[Figure: improvement of estimates of the exponent ω of the computational complexity of matrix multiplication over time.]

Freivalds' algorithm is a simple Monte Carlo algorithm that, given matrices A, B and C, verifies in Θ(n^2) time if AB = C.
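A minimal sketch of Freivalds' check (assuming NumPy and exact arithmetic, e.g. integer matrices; the function name and the number of trials are illustrative): each trial uses three matrix–vector products, so it runs in Θ(n^2) time, and if AB ≠ C a single trial detects this with probability at least 1/2, so k independent trials miss it with probability at most 2^−k.

```python
import numpy as np

def freivalds_check(A, B, C, trials=20):
    """Monte Carlo test of AB = C; False means a discrepancy was found,
    True means 'probably equal' (error probability at most 2**-trials)."""
    n = A.shape[0]
    for _ in range(trials):
        r = np.random.randint(0, 2, size=n)  # random 0/1 vector
        # A @ (B @ r) and C @ r are three matrix-vector products: Θ(n^2) work
        if not np.array_equal(A @ (B @ r), C @ r):
            return False
    return True
```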

Matrix multiplication exponent

It is an open question in theoretical computer science how well Strassen's algorithm can be improved in terms of asymptotic complexity. The matrix multiplication exponent, usually denoted ω, is the smallest real number for which any two n × n matrices over a field can be multiplied together using n^(ω + o(1)) field operations. The current best bound on ω is ω < 2.3728596, by Josh Alman and Virginia Vassilevska Williams.[2] This algorithm, like all other recent algorithms in this line of research, is a generalization of the Coppersmith–Winograd algorithm, which was given by Don Coppersmith and Shmuel Winograd in 1990, was the best matrix multiplication algorithm until 2010, and has an asymptotic complexity of O(n^2.375477).[20] The conceptual idea of these algorithms is similar to Strassen's algorithm: a way is devised for multiplying two k × k-matrices with fewer than k^3 multiplications, and this technique is applied recursively. However, the constant coefficient hidden by the Big O notation is so large that these algorithms are only worthwhile for matrices that are too large to handle on present-day computers.[21][22]

Since any algorithm for multiplying two n × n-matrices has to process all 2n^2 entries, there is an asymptotic lower bound of Ω(n^2) operations. Raz proved a lower bound of Ω(n^2 log(n)) for bounded coefficient arithmetic circuits over the real or complex numbers.[23]

Group theory reformulation of matrix multiplication algorithms

Henry Cohn, Robert Kleinberg, Balázs Szegedy and Chris Umans put methods such as the Strassen and Coppersmith–Winograd algorithms in an entirely different group-theoretic context, by utilising triples of subsets of finite groups which satisfy a disjointness property called the triple product property (TPP). They also give conjectures that, if true, would imply that there are matrix multiplication algorithms with essentially quadratic complexity. This implies that the optimal exponent of matrix multiplication is 2, which most researchers believe is indeed the case.[22] One such conjecture is that families of wreath products of Abelian groups with symmetric groups realise families of subset triples with a simultaneous version of the TPP.[24][25] Several of their conjectures have since been disproven by Blasiak, Cohn, Church, Grochow, Naslund, Sawin, and Umans using the Slice Rank method.[26] Further, Alon, Shpilka and Chris Umans have recently shown that some of these conjectures implying fast matrix multiplication are incompatible with another plausible conjecture, the sunflower conjecture.[27]

The importance of the computational complexity of matrix multiplication relies on the fact that many algorithmic problems may be solved by means of matrix computation, and most problems on matrices have a complexity which is either the same as that of matrix multiplication (up to a multiplicative constant), or may be expressed in terms of the complexity of matrix multiplication or its exponent ω.

There are several advantages of expressing complexities in terms of the exponent ω of matrix multiplication. Firstly, if ω is improved, this will automatically improve the known upper bound of complexity of many algorithms. Secondly, in practical implementations, one never uses the matrix multiplication algorithm that has the best asymptotic complexity, because the constant hidden behind the big O notation is too large for making the algorithm competitive for sizes of matrices that can be manipulated in a computer.[citation needed] Thus expressing complexities in terms of ω provides a more realistic measure, since it remains valid whichever algorithm is chosen for matrix computation.

Problems that have the same asymptotic complexity as matrix multiplication include the determinant, matrix inversion, and Gaussian elimination (see the next section). Problems with complexity expressible in terms of ω include the characteristic polynomial, eigenvalues (but not eigenvectors), Hermite normal form, and Smith normal form.[citation needed]

Matrix inversion, determinant and Gaussian elimination

In his 1969 paper, where he proved the complexity O(n^(log2 7)) ≈ O(n^2.807) for matrix computation, Strassen also proved that matrix inversion, determinant and Gaussian elimination have, up to a multiplicative constant, the same computational complexity as matrix multiplication. The proof does not make any assumptions on the matrix multiplication algorithm that is used, except that its complexity is O(n^ω) for some ω ≥ 2.

The starting point of Strassen's proof is block matrix multiplication. Specifically, a matrix of even dimension 2n×2n may be partitioned into four n×n blocks

M = \begin{pmatrix} A & B \\ C & D \end{pmatrix}.

Under this form, its inverse is

M^{-1} = \begin{pmatrix} A^{-1} + A^{-1}B(D - CA^{-1}B)^{-1}CA^{-1} & -A^{-1}B(D - CA^{-1}B)^{-1} \\ -(D - CA^{-1}B)^{-1}CA^{-1} & (D - CA^{-1}B)^{-1} \end{pmatrix},

provided that A and D − CA^{-1}B (the Schur complement of A) are invertible.

Thus, the inverse of a 2n×2n matrix may be computed with two inversions, six multiplications and four additions or additive inverses of n×n matrices. It follows that, denoting respectively by I(n), M(n) and A(n) = n^2 the number of operations needed for inverting, multiplying and adding n×n matrices, one has

I(2n) ≤ 2I(n) + 6M(n) + 4A(n).

If n = 2^k, one may apply this formula recursively:

I(2^k) ≤ 2I(2^(k−1)) + 6M(2^(k−1)) + 4A(2^(k−1))
       ≤ 2^2 I(2^(k−2)) + 6(M(2^(k−1)) + 2M(2^(k−2))) + 4(A(2^(k−1)) + 2A(2^(k−2)))
       ⋮
       ≤ 2^k I(1) + 6 Σ_{i=1}^{k} 2^(i−1) M(2^(k−i)) + 4 Σ_{i=1}^{k} 2^(i−1) A(2^(k−i)).

If M(n) ≤ cn^ω for some constant c and some ω ≥ 2, one gets eventually

I(n) ≤ d n^ω

for some constant d.

For matrices whose dimension is not a power of two, the same complexity is reached by increasing the dimension of the matrix to a power of two, by padding the matrix with rows and columns whose entries are 1 on the diagonal and 0 elsewhere.

This proves the asserted complexity for matrices such that all submatrices that have to be inverted are indeed invertible. This complexity is thus proved for almost all matrices, as a matrix with randomly chosen entries is invertible with probability one.
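The recursion just described can be written down compactly. The following Python sketch (assuming NumPy, a dimension that is a power of two, and that the block A and every Schur complement encountered are invertible, per the caveat above; the function name block_inverse is illustrative) uses exactly two recursive inversions, six multiplications, and four additions or negations per level:

```python
import numpy as np

def block_inverse(M):
    """Recursive blockwise inversion of a matrix of power-of-two dimension."""
    n = M.shape[0]
    if n == 1:
        return np.array([[1.0 / M[0, 0]]])
    h = n // 2
    A, B, C, D = M[:h, :h], M[:h, h:], M[h:, :h], M[h:, h:]
    Ainv = block_inverse(A)        # inversion 1
    T1 = Ainv @ B                  # multiplication 1
    T2 = C @ Ainv                  # multiplication 2
    S = D - C @ T1                 # multiplication 3: Schur complement D - C A^{-1} B
    Sinv = block_inverse(S)        # inversion 2
    T3 = T1 @ Sinv                 # multiplication 4
    lower_left = -(Sinv @ T2)      # multiplication 5
    upper_left = Ainv + T3 @ T2    # multiplication 6
    return np.block([[upper_left, -T3], [lower_left, Sinv]])
```

Replacing the @ products by any multiplication routine of cost M(n) reproduces the recurrence I(2n) ≤ 2I(n) + 6M(n) + 4A(n) above; for example, on a strictly diagonally dominant test matrix (whose relevant submatrices are all invertible) block_inverse(M) @ M is the identity up to rounding error.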

The same argument applies to LU decomposition, as, if the matrix A is invertible, the equality

\begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{pmatrix} I & 0 \\ CA^{-1} & I \end{pmatrix} \begin{pmatrix} A & B \\ 0 & D - CA^{-1}B \end{pmatrix}

defines a block LU decomposition that may be applied recursively to A and D − CA^{-1}B for getting eventually a true LU decomposition of the original matrix.

The argument applies also for the determinant, since it results from the block LU decomposition that

\det\begin{pmatrix} A & B \\ C & D \end{pmatrix} = \det(A)\,\det(D - CA^{-1}B).
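Under the same assumptions as above, the determinant follows by recursing on the two diagonal factors of this decomposition. A short illustrative sketch, reusing the block_inverse function from the previous sketch:

```python
def block_det(M):
    """Determinant via det(M) = det(A) * det(D - C A^{-1} B); assumes a
    power-of-two dimension and invertible A and Schur complements."""
    n = M.shape[0]
    if n == 1:
        return M[0, 0]
    h = n // 2
    A, B, C, D = M[:h, :h], M[:h, h:], M[h:, :h], M[h:, h:]
    S = D - C @ block_inverse(A) @ B   # Schur complement of A
    return block_det(A) * block_det(S)
```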

See also

References

  1. ^ a b Cite error: The named reference skiena was invoked but never defined (see the help page).
  2. ^ a b Alman, Josh; Williams, Virginia Vassilevska (2020), "A Refined Laser Method and Faster Matrix Multiplication", 32nd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2021), arXiv:2010.05846
  3. ^ Hartnett, Kevin. "Matrix Multiplication Inches Closer to Mythic Goal". Quanta Magazine. Retrieved 2021-04-01.
  4. ^ Miller, Webb (1975), "Computational complexity and numerical stability", SIAM News, 4 (2): 97–107, CiteSeerX 10.1.1.148.9947, doi:10.1137/0204009
  5. ^ Press, William H.; Flannery, Brian P.; Teukolsky, Saul A.; Vetterling, William T. (2007). Numerical Recipes: The Art of Scientific Computing (3rd ed.). Cambridge University Press. p. 108. ISBN 978-0-521-88068-8.
  6. ^ Volker Strassen (Aug 1969). "Gaussian elimination is not optimal". Numerische Mathematik. 13 (4): 354–356. doi:10.1007/BF02165411. S2CID 121656251.
  7. ^ Victor Yakovlevich Pan (Oct 1978). "Strassen's Algorithm is not Optimal: Trilinear Technique of Aggregating, Uniting and Canceling for Constructing Fast Algorithms for Matrix Operations". Proc. 19th FOCS. pp. 166–176. doi:10.1109/SFCS.1978.34. S2CID 14348408.
  8. ^ Dario Andrea Bini; Milvio Capovani; Francesco Romani; Grazia Lotti (Jun 1979). "O(n^2.7799) complexity for approximate matrix multiplication". Information Processing Letters. 8 (5): 234–235. doi:10.1016/0020-0190(79)90113-3.
  9. ^ A. Schönhage (1981). "Partial and total matrix multiplication". SIAM Journal on Computing. 10 (3): 434–455. doi:10.1137/0210032.
  10. ^ Francesco Romani (1982). "Some properties of disjoint sums of tensors related to matrix multiplication". SIAM Journal on Computing. 11 (2): 263–267. doi:10.1137/0211020.
  11. ^ D. Coppersmith; S. Winograd (1981). "On the asymptotic complexity of matrix multiplication". Proc. 22nd Annual Symposium on Foundations of Computer Science (FOCS). pp. 82–90. doi:10.1109/SFCS.1981.27. S2CID 206558664.
  12. ^ Volker Strassen (Oct 1986). "The asymptotic spectrum of tensors and the exponent of matrix multiplication". Proc. 27th Ann. Symp. on Foundation of Computer Science (FOCS). pp. 49–54. doi:10.1109/SFCS.1986.52. S2CID 15077423.
  13. ^ D. Coppersmith; S. Winograd (Mar 1990). "Matrix multiplication via arithmetic progressions". Journal of Symbolic Computation. 9 (3): 251–280. doi:10.1016/S0747-7171(08)80013-2.
  14. ^ a b Williams, Virginia Vassilevska. Multiplying matrices in O(n^2.373) time (PDF) (Technical Report). Stanford University.
  15. ^ Stothers, Andrew James (2010). On the complexity of matrix multiplication (Ph.D. thesis). University of Edinburgh.
  16. ^ Virginia Vassilevska Williams (2012). "Multiplying Matrices Faster than Coppersmith-Winograd". In Howard J. Karloff; Toniann Pitassi (eds.). Proc. 44th Symposium on Theory of Computing (STOC). ACM. pp. 887–898. doi:10.1145/2213977.2214056.
  17. ^ Le Gall, François (2014), "Powers of tensors and fast matrix multiplication", in Katsusuke Nabeshima (ed.), Proceedings of the 39th International Symposium on Symbolic and Algebraic Computation (ISSAC), pp. 296–303, arXiv:1401.7714, Bibcode:2014arXiv1401.7714L, ISBN 978-1-4503-2501-1
  18. ^ Alman, Josh; Williams, Virginia Vassilevska (2020), "A Refined Laser Method and Faster Matrix Multiplication", 32nd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2021), arXiv:2010.05846
  19. ^ Raz, Ran (January 2003). "On the Complexity of Matrix Product". SIAM Journal on Computing. 32 (5): 1356–1369. doi:10.1137/s0097539702402147. ISSN 0097-5397.
  20. ^ Coppersmith, Don; Winograd, Shmuel (1990), "Matrix multiplication via arithmetic progressions" (PDF), Journal of Symbolic Computation, 9 (3): 251, doi:10.1016/S0747-7171(08)80013-2
  21. ^ Iliopoulos, Costas S. (1989), "Worst-case complexity bounds on algorithms for computing the canonical structure of finite abelian groups and the Hermite and Smith normal forms of an integer matrix" (PDF), SIAM Journal on Computing, 18 (4): 658–669, CiteSeerX 10.1.1.531.9309, doi:10.1137/0218045, MR 1004789, archived from the original (PDF) on 2014-03-05, retrieved 2015-01-16, The Coppersmith–Winograd algorithm is not practical, due to the very large hidden constant in the upper bound on the number of multiplications required.
  22. ^ a b Robinson, Sara (November 2005), "Toward an Optimal Algorithm for Matrix Multiplication" (PDF), SIAM News, 38 (9), Even if someone manages to prove one of the conjectures—thereby demonstrating that ω = 2—the wreath product approach is unlikely to be applicable to the large matrix problems that arise in practice. [...] the input matrices must be astronomically large for the difference in time to be apparent.
  23. ^ Raz, Ran (2002). "On the complexity of matrix product". Proceedings of the Thirty-fourth Annual ACM Symposium on Theory of Computing: 144. doi:10.1145/509907.509932. ISBN 1581134959. S2CID 9582328.
  24. ^ Cohn, H.; Kleinberg, R.; Szegedy, B.; Umans, C. (2005). "Group-theoretic Algorithms for Matrix Multiplication". 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS'05). p. 379. doi:10.1109/SFCS.2005.39. ISBN 0-7695-2468-0. S2CID 41278294.
  25. ^ Henry Cohn, Chris Umans. A Group-theoretic Approach to Fast Matrix Multiplication. arXiv:math.GR/0307321. Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science, 11–14 October 2003, Cambridge, MA, IEEE Computer Society, pp. 438–449.
  26. ^ Blasiak, J.; Cohn, H.; Church, T.; Grochow, J.; Naslund, E.; Sawin, W.; Umans, C. (2017). "On cap sets and the group-theoretic approach to matrix multiplication". Discrete Analysis. doi:10.19086/da.1245. S2CID 9687868.
  27. ^ Alon, Shpilka, Umans, On Sunflowers and Matrix Multiplication