View source
Views
Page
Discussion
View source
History
From The Similarity Search Wiki
for
Main Page
Jump to:
navigation
,
search
[[Open Problems]] || [[Links]] || [[Researchers]] || [[Bibliography]] <big> The Similarity Search Wiki</big> The purpose of this wiki is to collect links, people, ideas, keywords, papers, slides, code and data sets on nearest neighbors in a single place. == Definition == To preprocess a database of <math>N</math> objects so that given a query object, one can effectively determine its nearest neighbors in database. == Synonyms == Name of the problem: nearest neighbors, <math>k</math> nearest neighbors (<math>k</math>NN, <math>k</math>-NN), nearset neighbor search, proximity search, similarity search, approximate nearest neighbors (ANN), range queries, maximal intersection queries, post-office problem, partial match, best match file searching, best match retrieval, sequence nearest neighbors (SNN). == Solution concepts == locality-sensitive hashing (LSH), low-distortion embeddings, k-d trees, kd-trees, metric trees, M-trees, R*-trees, vp-trees, vantage point trees, vantage point forest, multi-vantage point tree, bisector trees, Orchard's algorithm, random projections, fixed queries tree, Voronoi tree, BBD-tree, min-wise independent permutations, Burkhard-Keller tree, generalized hyperplane tree, geometric near-neighbor access tree (GNAT), approximating eliminating search algorithm (AESA), inverted index, spatial approximation tree (SAT), sketches. == Applications == <math>k</math>-nearest neighbor classification algorithm, image similarity identification, audio similarity identification, fingerprint search, audio/video compression (MPEG), optical character recognition, coding theory, function approximation, recommendation systems, near-duplicate detection, targeting on-line ads, distributional similarity computation, spelling correction, nearest neighbor interpolation. == Industrial Applications == Companies that tackle big data problems with similarity exist: * [http://www.simmachines.com simMachines] offers a high performance similarity search engine for large datasets along with support and consulting services. == Related keywords == All-nearest-neighbors problem, indexing methods, spatial index, Voronoi diagram, spatial access methods (SAM), multidimensional access methods, closest pair, indexing algorithm, intrinsic dimension, Johnson-Lindenstrauss lemma, Johnson-Lindenstrauss transform, large-scale algorithms, scalability, dimensionality reduction, high dimensions, curse of dimensionality, high-dimensional spaces, cell probe model, metric spaces, Euclidean space, brunch-and-bound search, divide and conquer, massive data sets, metric embeddings, cell probe complexity, spatial data structures, Euclidean <math>k</math>-median problem, point sampling. == Call for promotion and feedback == Did you find a mistake or do you have a suggestion/addition? You can '''edit the wiki by yourself''', or you can <enkodemail to="arnoldoMuller@gmail.com">e-mail</enkodemail> the [[User:Amuller|webmaster]] with your suggestions/corrections/additions. The documentation for editing the wiki can be found [http://www.mediawiki.org/wiki/Help:Contents here]. Thanks for visiting this wiki!
Return to
Main Page
.
Navigation
Main Page
Bibliography
Researchers
General Links
Implementations
Datasets
Open Problems
Tutorials
Help
Random page
Categories
Search
Toolbox
What links here
Related changes
Special pages
Personal tools
Log in