Locality-sensitive hashing
Appearance
Locality Sensitive Hashing (LSH) is a method of performing probabalistic dimension reduction of high-dimensional data. The basic idea is to Hash Function the input items so that similar items are mapped to the same buckets with high probability (the number of buckets being much smaller than the universe of possible input items).
Definition
A locality sensitive hashing scheme is defined with respect to a universe of items and a distance Metric (mathematics) . An LSH scheme is a family of hash functions coupled with a distribution over the functions such that a function chosen randomly according to satifies the property that for any .
Applications
LSH has been applied to several problem domains including
- Near Duplicate Algorithms
- Image Similarity Identification
- Gene Expression Similarity Identification
- Audio Similarity Identification
Related Papers
- Charikar, M.S. Estimation Techniques From Rounding Algorithms.
- Broder, A., Charikar, M.S., Frieze, A., Mitzenmacher, M., Independent Permutations