Jump to content

Talk:Locality-sensitive hashing

Page contents not supported in other languages.
From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Clark Mobarry (talk | contribs) at 17:26, 20 December 2007 (Definition of an LSH). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Just made the page. There are some variations among definitions of LSH - I am using Charikar's. Flamholz 19:40, 6 June 2007 (UTC)[reply]

Definition of an LSH

I don't think the current definition really makes sense, although maybe it could be modified a little to work.

Specifically: for a metric phi(x,y), we have phi(x,y)->0 (intuitively) as x->y. But if Pr[h(x)=h(y)] -> 0 as x->y, that's bad! I mean, that is just about the opposite of a locality-sensitive hash.

One fix might be to say Pr[h(a) = h(b)] = 1 - phi(a,b) instead.

Although it would also be nice to allow for general boolean combinations of hashes, such as simultaneously hashing to many different values, and calling it a hit if some combination thereof actually collide. —Preceding unsigned comment added by PhiloMath (talkcontribs) 07:14, 5 December 2007 (UTC)[reply]

I totally agree. The definition as it stands is wrong. The phi(a,b) is a similarity not a distance or metric (mathematics). Another error is that the Jaccard_index which is a similarity but is currently is referred to as the "Jaccard distance". Notice that the correct definition looks like a probabilistic version of Injective_mapping. cmobarry (talk) 17:26, 20 December 2007 (UTC)[reply]