Talk:Locality-sensitive hashing
Just made the page. There are some variations among definitions of LSH - I am using Charikar's. Flamholz 19:40, 6 June 2007 (UTC)
Definition of an LSH
I don't think the current definition really makes sense, although maybe it could be modified a little to work.
Specifically: for a metric phi(x,y), we have phi(x,y)->0 (intuitively) as x->y. But if Pr[h(x)=h(y)] -> 0 as x->y, that's bad! I mean, that is just about the opposite of a locality-sensitive hash.
One fix might be to say Pr[h(a) = h(b)] = 1 - phi(a,b) instead.
Although it would also be nice to allow for general boolean combinations of hashes, such as simultaneously hashing to many different values, and calling it a hit if some combination thereof actually collide. —Preceding unsigned comment added by PhiloMath (talk • contribs) 07:14, 5 December 2007 (UTC)
I totally agree. The definition as it stands is wrong. The phi(a,b) is a similarity not a distance or metric (mathematics). Another error is that the Jaccard_index which is a similarity but is currently is referred to as the "Jaccard distance". Notice that the correct definition looks like a probabilistic version of Injective_mapping. cmobarry (talk) 17:26, 20 December 2007 (UTC)