I loaded my data into FAISS in chunks because it was very large. The process produced 110 chunks, each with its own .faiss and .pkl file. I wrote this code to merge them:
import faiss
import os

def merge_faiss_indexes(index_dir, output_path):
    subdirs = [os.path.join(index_dir, d) for d in os.listdir(index_dir) if os.path.isdir(os.path.join(index_dir, d))]
    print("Subdirectories found:", subdirs)
    index_files = []
    for subdir in subdirs:
        for file in os.listdir(subdir):
            if file.endswith('.faiss'):
                index_files.append(os.path.join(subdir, file))
    if not index_files:
        raise ValueError("No FAISS index files found in the directory or subdirectories!")
    print(f"Found FAISS index files: {index_files}")
    base_index = faiss.read_index(index_files[0])
    print(f"Loaded base index: {index_files[0]}")
    for index_file in index_files[1:]:
        print(f"Merging index: {index_file}")
        to_merge = faiss.read_index(index_file)
        base_index.merge_from(to_merge)
    os.makedirs(os.path.dirname(output_path), exist_ok=True)
    faiss.write_index(base_index, output_path)
    print(f"Merged index saved to: {output_path}")

if __name__ == "__main__":
    index_dir = "./FAISS_ALL_REF"
    output_path = "./FAISS_MERGED/merged_index.faiss"
    merge_faiss_indexes(index_dir, output_path)
However, this process creates only the .faiss file, not the .pkl file associated with it. Please help me change this code so that it creates both the .faiss and the .pkl file.
I am following the post below, but I was not able to get it working with save_local instead of write_index:
How to combine multiple FAISS indexes into one to get a single retriever
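As far as I understand, the .pkl that save_local writes holds the document store together with a mapping from vector position to document id, which would explain why merging only the raw .faiss files leaves nothing to write a .pkl from. Below is a rough sketch of the bookkeeping I think a merge has to do on that mapping. It uses plain dicts rather than the real pickled objects, the name merge_id_maps is hypothetical, and it assumes each chunk numbers its vectors contiguously from 0:

```python
from typing import Dict, List


def merge_id_maps(id_maps: List[Dict[int, str]]) -> Dict[int, str]:
    """Combine per-chunk {vector position -> document id} maps into one.

    Assumption: each chunk's positions run 0..n-1 with no gaps, so the
    number of entries equals the number of vectors in that chunk.
    """
    merged: Dict[int, str] = {}
    offset = 0
    for id_map in id_maps:
        # Vectors from a later chunk are appended after those already merged,
        # so their positions shift by the count merged so far.
        for position in sorted(id_map):
            merged[offset + position] = id_map[position]
        offset += len(id_map)
    return merged


# Hypothetical document ids, for illustration only.
chunk_a = {0: "doc-a0", 1: "doc-a1"}
chunk_b = {0: "doc-b0"}
print(merge_id_maps([chunk_a, chunk_b]))
```

If this picture is right, the merged .pkl needs the combined document store plus this shifted mapping, which is presumably what the vector store's own merge_from followed by save_local would take care of.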
Upvotes: 0
Views: 36