Dilpreet Singh

Reputation: 3

Merge multiple FAISS chunks in a single chunk

My data was very large, so I loaded it into FAISS in chunks. The process produced 110 chunks, each with its own .faiss and .pkl file. I have written this code to merge them:

import faiss
import os

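def merge_faiss_indexes(index_dir, output_path):
    """Merge every .faiss index found one level below index_dir into one index."""
    # Collect the chunk subdirectories (sorted so the merge order is reproducible).
    subdirs = sorted(
        os.path.join(index_dir, d)
        for d in os.listdir(index_dir)
        if os.path.isdir(os.path.join(index_dir, d))
    )
    print("Subdirectories found:", subdirs)

    # Gather the .faiss file(s) from each subdirectory.
    index_files = []
    for subdir in subdirs:
        for file in sorted(os.listdir(subdir)):
            if file.endswith('.faiss'):
                index_files.append(os.path.join(subdir, file))

    if not index_files:
        raise ValueError("No FAISS index files found in the directory or subdirectories!")

    print(f"Found FAISS index files: {index_files}")

    # Use the first index as the base and merge the rest into it.
    base_index = faiss.read_index(index_files[0])
    print(f"Loaded base index: {index_files[0]}")

    for index_file in index_files[1:]:
        print(f"Merging index: {index_file}")
        to_merge = faiss.read_index(index_file)
        base_index.merge_from(to_merge)

    # Create the output directory only if the path actually contains one.
    output_dir = os.path.dirname(output_path)
    if output_dir:
        os.makedirs(output_dir, exist_ok=True)

    # Note: this writes only the raw vector index; no .pkl is produced here.
    faiss.write_index(base_index, output_path)
    print(f"Merged index saved to: {output_path}")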

if __name__ == "__main__":

    index_dir = "./FAISS_ALL_REF" 
    output_path = "./FAISS_MERGED/merged_index.faiss"  

    merge_faiss_indexes(index_dir, output_path)

However, this process creates only the .faiss file, not the .pkl file associated with it. Kindly help me change this code so that it creates both the .faiss and .pkl files.
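For context, here is a minimal sketch of what the missing .pkl would have to contain. It assumes the .pkl files were written by LangChain's FAISS.save_local, which pickles a docstore together with an index_to_docstore_id mapping (vector position to document id). When one index is merged into another, the appended vectors get new positions, so each chunk's mapping has to be shifted by the number of vectors already in the merged index. The function name merge_id_maps and the document ids are hypothetical and only illustrate the offset logic.

```python
def merge_id_maps(chunks):
    """Merge per-chunk position -> document-id maps into one map.

    `chunks` is a list of (ntotal, index_to_docstore_id) pairs, one per
    chunk, given in the same order in which the vector indexes are merged.
    """
    merged = {}
    offset = 0  # number of vectors already present in the merged index
    for ntotal, id_map in chunks:
        for position, doc_id in id_map.items():
            # Vectors appended from this chunk start at position `offset`.
            merged[offset + position] = doc_id
        offset += ntotal
    return merged


# Hypothetical two-chunk example (made-up ids, not real data).
chunks = [
    (2, {0: "doc-a", 1: "doc-b"}),
    (3, {0: "doc-c", 1: "doc-d", 2: "doc-e"}),
]
merged = merge_id_maps(chunks)
print(merged)
```

Writing only the merged vectors with faiss.write_index skips this bookkeeping, which is why no usable .pkl appears next to the merged .faiss file.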

I am following the post below, but I could not get it to work using save_local instead of write_index.

How to combine multiple FAISS indexes into one to get a single retriever
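One possible way to get both files is to load each chunk as a LangChain vector store, merge at that level, and call save_local on the result. The sketch below assumes the chunks were saved with LangChain's FAISS.save_local, that the langchain_community package is installed, and that `embeddings` is the same embedding object used to build the chunks. The helper names find_faiss_chunks and merge_langchain_faiss are hypothetical, and this has not been run against the 110 chunks from the question.

```python
import os


def find_faiss_chunks(index_dir):
    """Return sorted (folder, index_name) pairs for each .faiss file one level down."""
    chunks = []
    for entry in sorted(os.listdir(index_dir)):
        folder = os.path.join(index_dir, entry)
        if not os.path.isdir(folder):
            continue
        for name in sorted(os.listdir(folder)):
            if name.endswith(".faiss"):
                # save_local writes <index_name>.faiss and <index_name>.pkl
                chunks.append((folder, name[: -len(".faiss")]))
    return chunks


def merge_langchain_faiss(index_dir, output_dir, embeddings, index_name="index"):
    """Merge LangChain FAISS stores and write both the .faiss and .pkl files."""
    # Imported here so find_faiss_chunks can be used without LangChain installed.
    from langchain_community.vectorstores import FAISS

    chunks = find_faiss_chunks(index_dir)
    if not chunks:
        raise ValueError("No FAISS index files found in the subdirectories!")

    merged = None
    for folder, name in chunks:
        store = FAISS.load_local(
            folder,
            embeddings,
            index_name=name,
            # Required to unpickle the .pkl; enable only for files you created.
            allow_dangerous_deserialization=True,
        )
        if merged is None:
            merged = store  # the first chunk becomes the base store
        else:
            merged.merge_from(store)  # merges vectors, docstore and id mapping

    os.makedirs(output_dir, exist_ok=True)
    merged.save_local(output_dir, index_name=index_name)
    return merged
```

With the paths from the question, the call would look like merge_langchain_faiss("./FAISS_ALL_REF", "./FAISS_MERGED", embeddings, index_name="merged_index"), which should leave merged_index.faiss and merged_index.pkl in ./FAISS_MERGED.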

Upvotes: 0

Views: 36
