user7980

Reputation: 83

Can I 'inner-search' the most similar vectors within a FAISS index?

I have a FAISS index populated with 8M embedding vectors. I don't have the embedding vectors anymore, only the index, and it is expensive to recompute the embeddings.

Can I search the index for the top-k most similar vectors to each of the index's vectors?

To be more concrete, say this is how my index was populated:

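import faiss
import numpy as np

d = 1024
N = 100
# FAISS expects float32 vectors and int64 ids
embeddings = np.random.rand(N, d).astype('float32')
ids = np.arange(N, dtype='int64')
index = faiss.index_factory(
    d, 'IDMap,Flat', faiss.METRIC_INNER_PRODUCT
)
index.add_with_ids(embeddings, ids)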

I would like to get D, I such that:

D, I = index.search(embeddings, k) 

but I no longer have access to embeddings; I only have the index.

I tried using index.reconstruct() to get back my (approximated?) embeddings but I run into

RuntimeError: Error in virtual void faiss::Index::reconstruct(faiss::Index::idx_t, float*) const at /root/miniconda3/conda-bld/faiss-pkg_1613228717761/work/faiss/Index.cpp:57: reconstruct not implemented for this type of index

Upvotes: 2

Views: 883

Answers (1)

Fedor

Reputation: 101

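First of all, it seems you did not call train() before add(). Note that this only matters for index types that require training (for example IVF or PQ variants); a Flat index like the one in your example can be populated without it.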

As for your question, you can simply keep a copy of the embeddings before adding them to the index.

Upvotes: 0
