Soft heap
![]() | This article includes a list of references, related reading, or external links, but its sources remain unclear because it lacks inline citations. (April 2023) |
In computer science, a soft heap is a variant on the simple heap data structure that has constant amortized time complexity for 5 types of operations. This is achieved by carefully "corrupting" (increasing) the keys of at most a constant number of values in the heap.
Applications
Despite their limitations and unpredictable nature, soft heaps are useful in the design of deterministic algorithms. They were used to achieve the best complexity to date for finding a minimum spanning tree. They can also be used to easily build an optimal selection algorithm, as well as near-sorting algorithms, which are algorithms that place every element near its final position, a situation in which insertion sort is fast.
One of the simplest examples is the selection algorithm. Say we want to find the kth largest of a group of n numbers. First, we choose an error rate of 1/3; that is, at most about 33% of the keys we insert will be corrupted. Now, we insert all n elements into the heap — we call the original values the "correct" keys, and the values stored in the heap the "stored" keys. At this point, at most n/3 keys are corrupted, that is, for at most n/3 keys is the "stored" key larger than the "correct" key, for all the others the stored key equals the correct key.
Next, we delete the minimum element from the heap n/3 times (this is done according to the "stored" key). As the total number of insertions we have made so far is still n, there are still at most n/3 corrupted keys in the heap. Accordingly, at least 2n/3 − n/3 = n/3 of the keys remaining in the heap are not corrupted.
Let L be the element with the largest correct key among the elements we removed. The stored key of L is possibly larger than its correct key (if L was corrupted), and even this larger value is smaller than all the stored keys of the remaining elements in the heap (as we were removing minimums). Therefore, the correct key of L is smaller than the remaining n/3 uncorrupted elements in the soft heap. Thus, L divides the elements somewhere between 33%/66% and 66%/33%. We then partition the set about L using the partition algorithm from quicksort and apply the same algorithm again to either the set of numbers less than L or the set of numbers greater than L, neither of which can exceed 2n/3 elements. Since each insertion and deletion requires O(1) amortized time, the total deterministic time is T(n) = T(2n/3) + O(n). Using case 3 of the master theorem for divide-and-conquer recurrences (with ε=1 and c=2/3), we know that T(n) = Θ(n).
The final algorithm looks like this:
function softHeapSelect(a[1..n], k) if k = 1 then return minimum(a[1..n]) create(S) for i from 1 to n insert(S, a[i]) for i from 1 to n/3 x := findmin(S) delete(S, x) xIndex := partition(a, x) // Returns new index of pivot x if k < xIndex softHeapSelect(a[1..xIndex-1], k) else softHeapSelect(a[xIndex..n], k-xIndex+1)
References
- Chazelle, Bernard (November 2000). "The soft heap: an approximate priority queue with optimal error rate" (PDF). J. ACM. 47 (6): 1012–1027. CiteSeerX 10.1.1.5.9705. doi:10.1145/355541.355554. S2CID 12556140.
- Kaplan, Haim; Zwick, Uri (2009). "A simpler implementation and analysis of Chazelle's soft heaps". Proceedings of the Nineteenth Annual ACM–SIAM Symposium on Discrete Algorithms. Society for Industrial and Applied Mathematics. pp. 477–485. CiteSeerX 10.1.1.215.6250. doi:10.1137/1.9781611973068.53. ISBN 978-0-89871-680-1.