Nikhil

Reputation: 374

FAISS Db Vector Search

I am combining two numerical embeddings, each of shape (1,1):

combined_embeddings = np.hstack((price_embeddings, location_embeddings))
index = faiss.IndexFlatL2(combined_embeddings.shape[1])
index.add(np.array(combined_embeddings, dtype=np.float32))

Now if I give both embeddings as input, it works perfectly:

input_combined_embedding = np.hstack((input_price,input_location))
distances, indices = index.search(np.array(input_combined_embedding, dtype=np.float32), k=k)

The search compares the vectors by Euclidean distance.

What I want, though, is to decide dynamically at run time which embeddings the search should be based on. For example, in the scenario above, if I want to search on input_price alone, FAISS doesn't let me, because the query vector must have the same dimension as the index. I also cannot pad the missing dimension, because the padding value itself would affect the distance calculation.
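To make the padding problem concrete, here is a small NumPy-only sketch. The stored values and the query price are made up for illustration, and the distance is computed by hand as a squared L2 distance rather than through FAISS:

```python
import numpy as np

# Hypothetical stored vectors: column 0 = price, column 1 = location
stored = np.array([[0.5, 0.9], [0.8, 0.1], [0.3, 0.4]], dtype=np.float32)

query_price = 0.6
# Zero-pad the missing location dimension so the query is 2-D
padded_query = np.array([query_price, 0.0], dtype=np.float32)

# Squared L2 distance over both columns: the padding value takes part
padded_dist = ((stored - padded_query) ** 2).sum(axis=1)
# Squared distance over the price column only: what is actually wanted
price_only_dist = (stored[:, 0] - query_price) ** 2

padded_order = np.argsort(padded_dist)
price_order = np.argsort(price_only_dist)

print("Ranking with zero padding:", padded_order)
print("Ranking on price only:   ", price_order)
```

With these numbers the two rankings disagree on the nearest row, because every stored location value is measured against the padded 0.0.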

Is there a way to achieve this in FAISS or in any other vector DB?

Upvotes: 0

Views: 87

Answers (1)

KraYu

Reputation: 1

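Instead of storing the combined embeddings in one FAISS index, create separate FAISS indexes for price_embeddings and location_embeddings. Then, based on user input, search in the appropriate index. As long as both indexes are filled in the same row order, a returned index i refers to the same item in either one.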

For example:

import faiss
import numpy as np

# Example embeddings (1D each)
price_embeddings = np.array([[0.5], [0.8], [0.3]], dtype=np.float32)
location_embeddings = np.array([[0.2], [0.9], [0.4]], dtype=np.float32)

# Create separate FAISS indexes
price_index = faiss.IndexFlatL2(1)
location_index = faiss.IndexFlatL2(1)

# Add embeddings to respective indexes
price_index.add(price_embeddings)
location_index.add(location_embeddings)

# Query based on price only
input_price = np.array([[0.55]], dtype=np.float32)
distances, indices = price_index.search(input_price, k=2)

print("Price-based search results:", indices, distances)

# Query based on location only
input_location = np.array([[0.3]], dtype=np.float32)
distances, indices = location_index.search(input_location, k=2)

print("Location-based search results:", indices, distances)
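If you would rather keep a single array of combined vectors, one alternative is an exhaustive search in plain NumPy that only looks at the columns you select at query time. This is a rough sketch, not a FAISS feature: masked_search and its mask argument are hypothetical names, and the data is made up. It does the same kind of exact brute-force comparison as IndexFlatL2, just restricted to the chosen columns, so it is only practical for modest data sizes.

```python
import numpy as np

def masked_search(vectors, query, mask, k):
    """Exact k-nearest search using squared L2 over the masked columns only.

    Hypothetical helper for illustration; not part of FAISS.
    """
    mask = np.asarray(mask, dtype=bool)
    diff = vectors[:, mask] - query[mask]   # ignore unselected columns
    dist = (diff ** 2).sum(axis=1)          # squared L2 per stored row
    idx = np.argsort(dist, kind="stable")[:k]
    return dist[idx], idx

# Made-up data: column 0 = price, column 1 = location
vectors = np.array([[0.5, 0.2], [0.8, 0.9], [0.3, 0.4]], dtype=np.float32)
query = np.array([0.6, 0.35], dtype=np.float32)

price_d, price_i = masked_search(vectors, query, [True, False], k=2)
loc_d, loc_i = masked_search(vectors, query, [False, True], k=2)
both_d, both_i = masked_search(vectors, query, [True, True], k=2)

print("Price only:   ", price_i, price_d)
print("Location only:", loc_i, loc_d)
print("Both:         ", both_i, both_d)
```

The separate-index approach above scales better if you need FAISS's optimized search; the masked version is mainly useful when the number of column combinations is large and the dataset is small.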

Upvotes: 0
