Reputation: 83
I have a FAISS index populated with 8M embedding vectors. I don't have the embedding vectors anymore, only the index, and it is expensive to recompute the embeddings.
Can I search the index for the top-k most similar vectors to each of the index's vectors?
To be more concrete, say this is how my index was populated:
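import faiss
import numpy as np

d = 1024
N = 100
# FAISS expects float32 vectors and int64 ids
embeddings = np.random.rand(N, d).astype('float32')
ids = np.arange(N, dtype='int64')
index = faiss.index_factory(
    d, 'IDMap,Flat', faiss.METRIC_INNER_PRODUCT
)
index.add_with_ids(embeddings, ids)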
I would like to get D, I such that:
D, I = index.search(embeddings, k)
but I don't have access to embeddings anymore, I only have the index.
I tried using index.reconstruct() to get back my (approximated?) embeddings, but I run into:
RuntimeError: Error in virtual void faiss::Index::reconstruct(faiss::Index::idx_t, float*) const at /root/miniconda3/conda-bld/faiss-pkg_1613228717761/work/faiss/Index.cpp:57: reconstruct not implemented for this type of index
Upvotes: 2
Views: 883
Reputation: 101
First of all, it seems you forgot to call train() on your embeddings before add()-ing them.
As for your question, you can simply keep a copy of embeddings before adding them to the index.
Upvotes: 0