
Weak heap

A weak heap is a variation of the binary heap data structure which allows a more efficient implementation of priority queues and sorting than standard binary heaps. More specifically, it requires fewer comparisons than most other algorithms, so it is helpful when comparisons are expensive (such as when comparing strings using the full Unicode collation algorithm).

Description

A weak max-heap on a set of n values is defined to be a binary tree with n nodes, one for each value, satisfying the following constraints:[1]

  • The root node has no left child
  • For every node, the value associated with that node is greater than or equal to the values associated with all nodes in its right subtree.
  • The leaves of the tree have heights that are all within one of each other.

These invariants are similar to a standard binary heap, but weaker because a node is not kept sorted with respect to its left subtree, and the leaves on the bottom level are not required to be left-aligned in a complete binary tree.

(A weak min-heap reverses the conditions in the obvious way.)
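The ordering constraint can be made concrete with a short sketch. The following Python fragment (the Node type and function names are illustrative, not taken from the cited sources) checks only the second invariant of a weak max-heap, that each node's value is at least every value in its right subtree; the root-has-no-left-child and balanced-leaf constraints are not checked:

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Node:
        value: int
        left: "Optional[Node]" = None    # a weak-heap root has no left child
        right: "Optional[Node]" = None

    def largest_in(t: "Optional[Node]") -> float:
        """Largest value anywhere in the subtree rooted at t."""
        if t is None:
            return float("-inf")
        return max(t.value, largest_in(t.left), largest_in(t.right))

    def weak_heap_ordered(t: "Optional[Node]") -> bool:
        """True if every node is >= all values in its right subtree."""
        if t is None:
            return True
        return (t.value >= largest_in(t.right)
                and weak_heap_ordered(t.left)
                and weak_heap_ordered(t.right))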

The structure of a weak heap maps very neatly onto the 1-based (Ahnentafel) implicit binary tree arrangement, where the children of node k are numbered 2k and 2k+1, by adding an additional node 0. This has only a right child, which is node 1 (2×0+1).

Weak heaps require the ability to exchange the left and right children (and associated subtrees) of a node. In an explicit (pointer-based) representation of the tree, this is straightforward. In an implicit (array) representation, this requires one bit per internal node to indicate which child is considered the left child. Thus, a weak heap is not a strictly implicit data structure since it requires O(n) additional space (½ bit per node). However, it is often possible to find space for this extra bit within the node structure, such as by tagging a pointer which is already present.
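As an illustrative sketch (not taken from the cited sources), the implicit representation can be expressed with an array a of values and an array r of reverse bits, one per node; flipping a node's bit exchanges its children in constant time:

    # Node 0 is the overall root; by convention r[0] stays 0 and node 0 is
    # treated as having only the right child, node 1.
    def left_child(i, r):
        return 2 * i + r[i]        # the bit decides which child counts as "left"

    def right_child(i, r):
        return 2 * i + 1 - r[i]

    def parent(i):
        return i // 2

    def swap_children(i, r):
        r[i] ^= 1                  # exchanges the left and right subtrees of node i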

An equivalent model of a weak heap is that it is a root node whose right child is a complete binary tree. Within that tree, each node is ordered with respect to one of its subtrees, and the tag bit indicates which.

A weak heap is very similar to a binomial heap stored in a right-child left-sibling binary tree (as opposed to the usual left-child right-sibling order). A perfect (no missing leaves) weak heap with 2^n elements is exactly isomorphic to a binomial heap of the same size, but the two algorithms handle sizes which are not a power of 2 differently.

Operations on weak heaps

Note that every node in a weak heap can be considered the root of a smaller weak heap by ignoring its left subtree. That includes nodes with no children, which are automatically valid weak heaps.

Each node in a weak heap except the root has a distinguished ancestor. This is the nearest ancestor with which it has an order relationship. If the node is a right child, the distinguished ancestor is its parent. If the node is a left child, its distinguished ancestor is the same as its parent's. (In the early papers, the distinguished ancestor was called the "grandparent", a meaning very different from the usual "parent of a parent".)

If the binary tree is interpreted as a general tree stored using the "right-child left-sibling" convention, then the distinguished ancestor is simply the parent.

Although the distinguished ancestor may be arbitrarily high in the tree, the average distance is 2. (It's at least 1, and half of the time we recurse, so D = 1 + D/2, meaning that D = 2.) Thus, even a simple iterative algorithm for finding the distinguished ancestor is efficient.
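A minimal iterative sketch of this search, using the array-and-reverse-bit representation above (names are illustrative):

    def d_ancestor(j, r):
        """Distinguished ancestor of node j > 0: climb while j is a left child."""
        while (j & 1) == r[j >> 1]:    # j is the left child of j >> 1
            j >>= 1
        return j >> 1                  # j is now a right child; its parent is the answer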

A given node is the distinguished ancestor of all nodes on the left spine of its right subtree. That is, its right child, and all of its right child's left-only descendants. Each of these can be considered (by ignoring its left subtree) to be the root of a weak heap of different height. The root can be removed by merging all of these weak heaps.

The fundamental operation on weak heaps is merging two heaps of equal height h, to make a weak heap of height h+1. This requires exactly one comparison, between the roots. Whichever root is greater (assuming a max-heap) is the final root. Its right child is the other root, which retains its right subtree. The final root's right subtree is installed as the other root's left subtree.
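Using the hypothetical Node type sketched earlier, the pointer-based form of this merge might look as follows (one comparison and a constant number of pointer updates):

    def merge(p: Node, q: Node) -> Node:
        """Merge two weak max-heaps of equal height rooted at p and q."""
        if q.value > p.value:
            p, q = q, p        # p now holds the greater root, the final root
        q.left = p.right       # the final root's old right subtree becomes q's left subtree
        p.right = q            # the other root becomes the final root's right child
        return p               # root of the merged heap, one level taller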

This operation can be performed on the implicit tree structure because the heaps being merged are never arbitrary. Rather, the two heaps are formed as follows:

  • One is a normal weak heap (whose left subtree exists, but is ignored).
  • The other is a "floating" root, whose right subtree is deemed to be the first heap's left subtree. This does not follow the usual implicit heap rules, but the implementation behaves as if the two were linked anyway.

After comparing the two, the merge proceeds in one of two ways:

  1. (The floating root is greater.) The floating root becomes the parent of the normal root, and nothing else needs changing, since the left subtree is already in place, or
  2. (The normal root is greater.) The normal root's children are swapped (using the reverse bit), and then the floating root and the normal root are swapped (by copying).

At the end, the floating root holds the root of the new combined tree, and the normal root holds the root of its right subtree.
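In the implicit representation this operation reduces to at most one bit flip and one swap of array entries. A sketch (the name join and the return convention are illustrative; i is the floating root and j the normal root):

    def join(a, r, i, j):
        """Merge the floating root at index i with the weak heap rooted at j
        (its left subtree ignored), for a max-heap."""
        if a[j] > a[i]:                 # case 2: the normal root is greater
            r[j] ^= 1                   # swap j's children via the reverse bit...
            a[i], a[j] = a[j], a[i]     # ...then exchange the two roots by copying
            return False
        return True                     # case 1: nothing else needs changing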

Weak-heap sort

Weak heaps may be used to sort an array, in essentially the same way as a conventional heapsort.[2] First, a weak heap is built out of all of the elements of the array; then the root is repeatedly exchanged with the last element of the heap, and the value moved to the root is sifted down to its proper place.

A weak heap of n elements can be formed in n−1 merges. This can be done in various orders, but a simple bottom-up implementation works from the end of the array to the beginning, merging each node (as the normal root) with its distinguished ancestor (as the floating root). Note that finding the distinguished ancestor is simplified because the reverse bits of all ancestors of the node being merged are unmodified from their initial state ("not reversed").
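Continuing the sketch, the bottom-up construction is then a single right-to-left pass of n − 1 joins:

    def weak_heapify(a, r):
        """Build a weak max-heap over a[0..n-1] using n - 1 merges."""
        for j in range(len(a) - 1, 0, -1):
            join(a, r, d_ancestor(j, r), j)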

The repeated removal is accomplished by first swapping the root with the last element of the heap. Then find the leftmost descendant of the main heap tree (the deepest node on its left spine). Taking this as the normal root, and the (now out of order) root as the floating root, merge the two height-1 heaps into a height-2 heap. Then go to the parent of the normal root, and merge again. Repeat until the root is finalized.
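A sketch of the removal step and the resulting sort, built on the helpers above (illustrative code, not the published pseudocode):

    def sift_down_root(a, r, size):
        """Restore the weak heap after a[0] has been replaced; `size` elements remain."""
        if size <= 1:
            return
        x = 1
        while 2 * x + r[x] < size:     # descend the left spine of the main heap tree
            x = 2 * x + r[x]
        while x > 0:                   # merge upward, taking a[0] as the floating root
            join(a, r, 0, x)
            x >>= 1

    def weak_heap_sort(a):
        n = len(a)
        r = [0] * n
        weak_heapify(a, r)
        for size in range(n - 1, 0, -1):
            a[0], a[size] = a[size], a[0]   # move the current maximum to its final place
            sift_down_root(a, r, size)
        return a

For example, weak_heap_sort([3, 1, 4, 1, 5]) returns [1, 1, 3, 4, 5].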

Priority queue operations

In a weak max-heap, the maximum value can be found (in constant time) as the value associated with the root node; similarly, in a weak min-heap, the minimum value can be found at the root.

As with binary heaps, weak heaps can support the typical operations of a priority queue data structure: insert, delete-min, delete, or decrease-key, in logarithmic time per operation. Variants of the weak heap structure allow constant amortized time insertions and decrease-keys, matching the time for Fibonacci heaps.[3]
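As one example of a logarithmic-time operation, an insertion can be sketched as appending the new element and then joining it with successive distinguished ancestors until the order holds (illustrative code; the constant-amortized-time variants cited above require additional machinery not shown):

    def weak_heap_insert(a, r, x):
        """Insert x into a weak max-heap stored in arrays a and r."""
        a.append(x)
        r.append(0)
        j = len(a) - 1
        while j != 0:
            i = d_ancestor(j, r)
            if join(a, r, i, j):    # already no larger than its distinguished ancestor
                break
            j = i                   # the new value moved up to i; keep checking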

History and applications

Weak heaps were introduced by Dutton (1993), as part of a variant heap sort algorithm that (unlike the standard heap sort using binary heaps) could be used to sort n items using only n log₂ n + O(n) comparisons.[2][4] They were later investigated as a more generally applicable priority queue data structure.[5][6]

References

  1. ^ Edelkamp, Stefan (26 May 2011), Pieterse, Vreda; Black, Paul E. (eds.), "weak-heap", Dictionary of Algorithms and Data Structures, retrieved 2015-12-01
  2. ^ a b Dutton, Ronald D. (1993), "Weak-heap sort", BIT, 33 (3): 372–381, doi:10.1007/bf01990520.
  3. ^ Edelkamp, Stefan; Elmasry, Amr; Katajainen, Jyrki (October 2012), "The weak-heap data structure: variants and applications" (PDF), Journal of Discrete Algorithms, 16: 187–205, doi:10.1016/j.jda.2012.04.010, MR 2960353.
  4. ^ Edelkamp, Stefan; Wegener, Ingo (2000), "On the Performance of WEAK-HEAPSORT", Proceedings of the Symposium on Theoretical Aspects of Computer Science (STACS 2000) (PDF), Lecture Notes in Computer Science, vol. 1770, Springer-Verlag, pp. 254–266, CiteSeerX 10.1.1.21.1863, doi:10.1007/3-540-46541-3_21.
  5. ^ Bruun, Asger; Edelkamp, Stefan; Katajainen, Jyrki; Rasmussen, Jens (2010), "Policy-Based Benchmarking of Weak Heaps and Their Relatives", Proceedings of the 9th International Symposium on Experimental Algorithms (SEA 2010), Lecture Notes in Computer Science, vol. 6049, Springer-Verlag, pp. 424–435, doi:10.1007/978-3-642-13193-6_36.
  6. ^ Edelkamp, Stefan; Elmasry, Amr; Katajainen, Jyrki (2012), "The Weak-heap Family of Priority Queues in Theory and Praxis", Proceedings of the Eighteenth Computing: The Australasian Theory Symposium (CATS 2012), Volume 128 (PDF), Darlinghurst, Australia: Australian Computer Society, Inc., pp. 103–112, ISBN 978-1-921770-09-8.