> Apart from search, ANNs can be used for recommendations, classification, and ot...

rpedela · on Aug 12, 2021

Google has a distributed embedding matching service in preview: https://cloud.google.com/vertex-ai/docs/matching-engine/over...

I guess it depends on what you mean by "simple". The algorithms are complex, but there are good tools that implement them. I would imagine smaller companies would use off-the-shelf tooling, and I would argue that is simpler. Vector embeddings are unbelievably powerful, and one of the good tools plus pretrained embeddings often yields better results than classical methods.

Specifically for search, I use them to completely replace stemming, synonyms, etc. in ES. I match the query's embedding against the document embeddings and take the top 1000 or so. Then I ask ES for the BM25 score of each of those 1000 documents. I combine the embedding match score with BM25, recency, etc. for the final rank. The results are so much better than using stemming, etc., and it's simpler overall because I can use off-the-shelf tooling and the data pipeline is simpler.
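A minimal sketch of the retrieve-then-rerank flow described above, not the actual implementation. Everything named here is an assumption for illustration: the brute-force cosine search stands in for a real ANN index, the `bm25` dict stands in for scores fetched from ES, and the weights and the recency half-life are made-up values.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_by_embedding(
    query_vec: Sequence[float],
    doc_vecs: Dict[str, Sequence[float]],
    k: int = 1000,
) -> List[Tuple[str, float]]:
    """Exact nearest-neighbour scan (a stand-in for an ANN index)."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def rerank(
    candidates: List[Tuple[str, float]],
    bm25: Dict[str, float],        # hypothetical: BM25 scores fetched from ES
    age_days: Dict[str, float],    # hypothetical: document age in days
    w_embed: float = 0.6,          # assumed weights, tune for your data
    w_bm25: float = 0.3,
    w_recency: float = 0.1,
    half_life_days: float = 30.0,  # assumed recency half-life
) -> List[Tuple[str, float]]:
    """Blend embedding similarity, normalised BM25, and recency into one score."""
    max_bm25 = max((bm25.get(d, 0.0) for d, _ in candidates), default=0.0) or 1.0
    ranked = []
    for doc_id, sim in candidates:
        lexical = bm25.get(doc_id, 0.0) / max_bm25
        recency = 0.5 ** (age_days.get(doc_id, 0.0) / half_life_days)
        ranked.append((doc_id, w_embed * sim + w_bm25 * lexical + w_recency * recency))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked
```

The linear blend is only one option; the point is that the embedding stage picks the candidates and the lexical score can still reorder them.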

hintymad · on Aug 12, 2021

> I match the query's embedding to the document embeddings,

I assume the documents are relatively small? Otherwise a single document may cover so many different topics that its embedding no longer discriminates well between different queries.

rpedela · on Aug 12, 2021

For my search use case, documents are mostly single-topic and less than 10 pages. However, I have found embeddings still work surprisingly well for longer documents with a few topics in them. But yes, multi-topic documents can certainly be an issue. Segmentation by sentence, paragraph, or page can help here. I believe there are ML-based topic segmentation algorithms too, but that certainly starts making it less simple.
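As a rough illustration of the paragraph-level segmentation mentioned above: split the document on blank lines, embed each segment, and score the document by its best-matching segment. This is a sketch under assumptions. `toy_embed` is a hypothetical stand-in that counts words from a tiny fixed vocabulary, and a real setup would plug a pretrained embedding model into the `embed` argument instead.

```python
import math
import re
from typing import Callable, List, Sequence, Tuple


def split_paragraphs(text: str) -> List[str]:
    """Split on blank lines and drop empty segments."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_segment(
    query: str,
    text: str,
    embed: Callable[[str], Sequence[float]],
) -> Tuple[str, float]:
    """Return the paragraph most similar to the query, with its score."""
    query_vec = embed(query)
    scored = [(seg, cosine(query_vec, embed(seg))) for seg in split_paragraphs(text)]
    return max(scored, key=lambda pair: pair[1], default=("", 0.0))


def toy_embed(text: str) -> List[float]:
    """Hypothetical placeholder: word counts over a tiny fixed vocabulary."""
    vocab = ["search", "index", "billing", "price"]
    words = re.findall(r"[a-z]+", text.lower())
    return [float(words.count(w)) for w in vocab]


doc = "Search uses an index.\n\nBilling has a price."
segment, score = best_segment("price of billing", doc, toy_embed)
```

Taking the maximum over segments means one on-topic paragraph is enough for a long, multi-topic document to match; averaging the segment scores would be the other obvious choice.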

gk1 · on Aug 12, 2021

Once you cross 10M items or 100 QPS, scaling such a system becomes non-trivial. That's not a high threshold for any enterprise software company handling customer data or any consumer tech company with >10M users. Once you add other requirements to the mix, such as index freshness and metadata filtering, the managed options that have these built in start to become compelling even at lower volumes.

Also, Pinecone (disclosure: I work there) has usage-based billing that starts at $72/month, so "paying for the technology" is not that scary.