Jump to content

Multi-key quicksort

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Qwertyus (talk | contribs) at 11:56, 9 April 2015 (Created page with ''''Three-way radix quicksort''', also known as '''multi-key quicksort''',<ref>{{DADS|multikey Quicksort|multikeyQuicksort}}</ref> is an algorithm for sorti...'). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.
(diff) ← Previous revision | Latest revision (diff) | Newer revision → (diff)

Three-way radix quicksort, also known as multi-key quicksort,[1] is an algorithm for sorting strings. This hybrid of quicksort and radix sort was originally suggested by P. Shackleton and reported in C.A.R. Hoare's seminal paper on quicksort; its modern incarnation was developed by Jon Bentley and Robert Sedgewick in the mid-1990s.[2] The algorithm is designed to exploit the property that in many problems, strings tend to have shared prefixes.

Description

The three-way radix quicksort algorithm sorts an array of N (pointers to) strings in lexicographic order. It is assumed that all strings are of equal length K; if the strings are of varying length, they must be padded with extra elements that are less-than any element in the strings.[note 1] The pseudocode for the algorithm is then[note 2]

function sort(a : array of string, d : integer):
    if length(a) ≤ 1 or d ≥ K:
        return
    p ← pivot(a, d)
    i, j ← partition(a, d, p)
    sort(a[0..i], d)
    sort(a[i..j], d+1)
    sort(a[j..length(a)], d)

The pivot function must return a single character. Bentley and Sedgewick suggest either picking the median of a[0][d], ..., a[length(a)−1][d] or some random character in that range.[2] The partition function is a variant of the one used in ordinary three-way quicksort: it rearranges a so that all of a[0], ..., a[i−1] have an element at position d that is less than p, a[i], ..., a[j−1] have p at position d, and strings from j onward have a d'th element larger than p.

Practical implementations of multi-key quicksort can benefit from the same optimizations typically applied to quicksort: median-of-three pivoting, switching to insertion sort for small arrays, etc.[3]

Notes

  1. ^ One way to do so without altering the in-memory representation of the strings is to index them using a function that returns −1 or some other small value when the index is out of range.
  2. ^ Arrays and strings are zero-indexed. Array slicing is assumed to be a non-copying, constant-time operation.

References

  1. ^ Public Domain This article incorporates public domain material from Paul E. Black. "multikey Quicksort". Dictionary of Algorithms and Data Structures. NIST.
  2. ^ a b Bentley, Jon; Sedgewick, Robert (1997). Fast algorithms for sorting and searching strings (PDF). Proc. Annual ACM-SIAM Symp. on Discrete Algorithms (SODA). ISBN 0-89871-390-0.
  3. ^ Bentley, Jon; Sedgewick, Robert (1998). "Sorting Strings with Three-Way Radix Quicksort". Dr. Dobb's Journal.