Jump to content

Worst-case optimal join algorithm

From Wikipedia, the free encyclopedia
The printable version is no longer supported and may have rendering errors. Please update your browser bookmarks and please use the default browser print function instead.
An illustration of properties of join algorithms. When performing a join between more than two relations on more than two attributes, binary join algorithms such as hash join operate over two relations at a time, and join them on all attributes in the join condition; worst-case optimal algorithms such as generic join operate on a single attribute at a time but join all the relations on this attribute.[1]

A worst-case optimal join algorithm is an algorithm for computing relational joins with a runtime that is bounded by the worst-case output size of the join. Traditional binary join algorithms such as hash join operate over two relations at a time; joins between more than two relations are implemented by repeatedly applying binary joins. Worst-case optimal join algorithms are asymptotically faster in worst case than any join algorithm based on such iterated binary joins.

The first worst-case optimal join algorithm, generic join, was published in 2012.[2] Worst-case optimal join algorithms have been implemented in commercial database systems, including the LogicBlox system.[3][4] Worst-case optimal joins have been applied to build a worst-case optimal algorithm for e-matching.[5]

References

Notes

  1. ^ Wang, Yisu Remy; Willsey, Max; Suciu, Dan (2023-01-27). "Free Join: Unifying Worst-Case Optimal and Traditional Joins". arXiv:2301.10841 [cs.DB].
  2. ^ Ngo, Hung Q.; Porat, Ely; Ré, Christopher; Rudra, Atri (2012-03-08). "Worst-case Optimal Join Algorithms". arXiv:1203.1952 [cs.DB].
  3. ^ Veldhuizen, Todd L. (2013-12-20). "Leapfrog Triejoin: a worst-case optimal join algorithm". arXiv:1210.0481 [cs.DB].
  4. ^ Freitag, Michael; Bandle, Maximilian; Schmidt, Tobias; Kemper, Alfons; Neumann, Thomas (2020-07-01). "Adopting worst-case optimal joins in relational database systems". Proceedings of the VLDB Endowment. 13 (12): 1891–1904. doi:10.14778/3407790.3407797. ISSN 2150-8097. S2CID 221115321.
  5. ^ Zhang, Yihong; Wang, Yisu Remy; Willsey, Max; Tatlock, Zachary (2022-01-12). "Relational e-matching". Proceedings of the ACM on Programming Languages. 6 (POPL): 35:1–35:22. doi:10.1145/3498696. S2CID 236924583.

Sources