
Substring index

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by The Anome (talk | contribs) at 00:17, 20 November 2009 (== References == {{reflist}}). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

A substring index is a data structure that supports substring search in a text or text collection in sublinear time. Given a document of length n, or a set of documents of total length n, all occurrences of a pattern can be located in o(n) time. (o(n) denotes growth strictly slower than O(n). See Big O notation.)
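As a rough illustration of the idea, the following Python sketch uses a suffix array, one well-known kind of substring index. The choice of structure and the function names build_suffix_array and find_occurrences are assumptions made for this example, not something prescribed by the article. For a text of length n and a pattern of length m, the query does O(log n) binary-search steps of at most m character comparisons each, then reports the matches, so short patterns are located without scanning the whole text. The construction shown is deliberately naive and much slower than the algorithms used in practice.

```python
def build_suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes of text in lexicographic order.

    Naive construction (sorts full suffixes) chosen for clarity only.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, suffix_array: list[int], pattern: str) -> list[int]:
    """Return the sorted start positions of every occurrence of pattern in text."""
    m = len(pattern)

    def boundary(strict: bool) -> int:
        # Binary search over the suffix array, comparing only the first
        # m characters of each suffix against the pattern.
        lo, hi = 0, len(suffix_array)
        while lo < hi:
            mid = (lo + hi) // 2
            start = suffix_array[mid]
            prefix = text[start:start + m]
            if prefix < pattern or (strict and prefix == pattern):
                lo = mid + 1
            else:
                hi = mid
        return lo

    first = boundary(strict=False)  # first suffix whose m-prefix is >= pattern
    last = boundary(strict=True)    # first suffix whose m-prefix is > pattern
    # Every suffix in [first, last) starts with the pattern.
    return sorted(suffix_array[first:last])


# Usage: build the index once, then answer many queries against it.
text = "banana"
index = build_suffix_array(text)
print(find_occurrences(text, index, "ana"))  # [1, 3]
```

The index is built once per text and reused for every query, which is what distinguishes this approach from scanning the text anew for each pattern.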

The phrase full-text index is also often used for an index of all substrings of a text. However, the phrase is ambiguous, because it is also used for ordinary word indexes such as inverted files and signature files. See full text search.

Substring indexes include:

References

  1. R. Grossi and J. S. Vitter, "Compressed Suffix Arrays and Suffix Trees, with Applications to Text Indexing and String Matching", SIAM Journal on Computing, 35(2), 2005, pp. 378-407.