Reputation: 374
I am combining two embeddings. Both are numerical, and each has shape (1, 1):
combined_embeddings = np.hstack((price_embeddings, location_embeddings))
index = faiss.IndexFlatL2(combined_embeddings.shape[1])
index.add(np.array(combined_embeddings, dtype=np.float32))
If I give both embeddings as input, the search works perfectly:
input_combined_embedding = np.hstack((input_price,input_location))
distances, indices = index.search(np.array(input_combined_embedding, dtype=np.float32), k=k)
It ranks the results by the Euclidean distance between the vectors.
What I want, though, is to decide dynamically at run time which embeddings the search should be based on. For example, in the scenario above, if I want to search by input_price alone, FAISS doesn't let me, because the query vector must have the same dimension as the index. I also cannot pad the missing dimension, because the padding value would still contribute to the distance calculation.
Is there a way to achieve this in FAISS or in any other vector DB?
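The padding concern can be made concrete with a minimal sketch. The values below are made up for illustration, and the distance is computed by hand with NumPy as a squared L2 distance, which is what an exact L2 index compares, rather than by calling FAISS:

```python
import numpy as np

# Hypothetical stored vector laid out as [price, location].
stored = np.array([0.5, 0.2], dtype=np.float32)

# Query that only knows the price; the location slot is padded with 0.
padded_query = np.array([0.5, 0.0], dtype=np.float32)

# Squared L2 distance over both dimensions: the padded slot still contributes.
padded_dist = float(np.sum((stored - padded_query) ** 2))

# Squared L2 distance over the price dimension only.
price_only_dist = float((stored[0] - padded_query[0]) ** 2)

print("padded:", padded_dist, "price only:", price_only_dist)
```

Even though the prices match exactly, the padded query ends up at a non-zero distance because the stored location value is compared against the padding.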
Upvotes: 0
Views: 87
Reputation: 1
Instead of storing combined embeddings in one FAISS index, create separate FAISS indexes for price_embeddings and location_embeddings. Then, based on user input, search in the appropriate index.
For example:
import faiss
import numpy as np
# Example embeddings (one dimension each, float32 as FAISS expects)
price_embeddings = np.array([[0.5], [0.8], [0.3]], dtype=np.float32)
location_embeddings = np.array([[0.2], [0.9], [0.4]], dtype=np.float32)
# Create separate FAISS indexes (exact L2 search, dimension 1)
price_index = faiss.IndexFlatL2(1)
location_index = faiss.IndexFlatL2(1)
# Add embeddings to their respective indexes
price_index.add(price_embeddings)
location_index.add(location_embeddings)
# Query based on price only
input_price = np.array([[0.55]], dtype=np.float32)
distances, indices = price_index.search(input_price, k=2)
print("Price-based search results:", indices, distances)
# Query based on location only
input_location = np.array([[0.3]], dtype=np.float32)
distances, indices = location_index.search(input_location, k=2)
print("Location-based search results:", indices, distances)
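If the choice of fields also needs to include "both at once", separate indexes alone do not cover that case. One possible alternative for small datasets is a brute-force search over only the selected columns. This is a minimal sketch, not a FAISS feature: it uses plain NumPy, and the names `FEATURES`, `data`, and `search_subset` are hypothetical. It computes exact squared L2 distances, restricted to the chosen columns.

```python
import numpy as np

# Hypothetical layout: one row per item, one column per feature.
FEATURES = ["price", "location"]
data = np.array([[0.5, 0.2],
                 [0.8, 0.9],
                 [0.3, 0.4]], dtype=np.float32)

def search_subset(query, k=2):
    """Exact k-NN by squared L2 distance, using only the fields present in `query`."""
    fields = list(query)
    cols = [FEATURES.index(f) for f in fields]           # columns to compare
    q = np.array([query[f] for f in fields], dtype=np.float32)
    diffs = data[:, cols] - q                            # ignore all other columns
    dists = np.sum(diffs ** 2, axis=1)
    order = np.argsort(dists, kind="stable")[:k]         # k closest rows
    return order, dists[order]

# Decide at run time which fields take part in the search.
price_ids, price_dists = search_subset({"price": 0.55})
both_ids, both_dists = search_subset({"price": 0.55, "location": 0.3})
print("Price only:", price_ids, price_dists)
print("Price and location:", both_ids, both_dists)
```

This scans every row on each query, so it only suits data that fits comfortably in memory. For larger collections, keeping one index per field combination you actually need is a more scalable variant of the same idea.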
Upvotes: 0