Jump to content

Locality-sensitive hashing

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Flamholz (talk | contribs) at 19:06, 6 June 2007 (Created page with ''''Locality Sensitive Hashing''' ('''LSH''') is a method of performing probabalistic dimension reduction of high-dimensional data. The basic idea is to [[hash|H...'). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.
(diff) ← Previous revision | Latest revision (diff) | Newer revision → (diff)

Locality Sensitive Hashing (LSH) is a method of performing probabalistic dimension reduction of high-dimensional data. The basic idea is to Hash Function the input items so that similar items are mapped to the same buckets with high probability (the number of buckets being much smaller than the universe of possible input items).

Definition

A locality sensitive hashing scheme is defined with respect to a universe of items and a distance Metric (mathematics) . An LSH scheme is a family of hash functions coupled with a distribution over the functions such that a function chosen randomly according to satifies the property that for any .

Applications

LSH has been applied to several problem domains including

  • Near Duplicate Algorithms
  • Image Similarity Identification
  • Gene Expression Similarity Identification
  • Audio Similarity Identification