Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

This is amazing! truly great explanation in the related blog post here http://willdrevo.com/fingerprinting-and-audio-recognition-wi... Does anyone know of a good explanation of locality sensitive hashing? I know there are other applications.


Shameless self-promotion: in my Stack Overflow answer [1], I reference good introductory LSH papers [2-5].

In short, LSH is an algorithm that hashes points that are nearby in a feature space into the same bin with high probability. Contrast that with cryptographically secure hashes where the tiniest change in the input is designed to yield a completely different hash. The point is that, in domains like multimedia, you want to tolerate some distortions to your signal, e.g. microphone noise, blur, etc. These minor distortions shouldn't affect your characterization of the data, e.g. "is this a guitar", "is this a cat", etc.

The advantages are that it's simple to implement, and it has mathematically provable probability bounds and query complexity.

[1] http://stackoverflow.com/questions/5751114/nearest-neighbors...

[2] http://www.cs.princeton.edu/courses/archive/spr05/cos598E/bi...

[3] http://www.vldb.org/conf/1998/p194.pdf

[4] http://www.vldb.org/conf/1999/P49.pdf

[5] http://web.iitd.ac.in/~sumeet/Slaney2008-LSHTutorial.pdf


Like "hashing", I don't think "locality sensitive hashing" is a specific technique, but more like a type of algorithm.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: