Reputation: 49
I am following LangChain's tutorial to create an example selector to automatically select similar examples given an input.
example_selector = SemanticSimilarityExampleSelector.from_examples(
    # This is the list of examples available to select from.
    examples,
    # This is the embedding class used to produce embeddings which are used to measure semantic similarity.
    OpenAIEmbeddings(),
    # This is the VectorStore class that is used to store the embeddings and do a similarity search over.
    Chroma,
    # This is the number of examples to produce.
    k=1,
)
I passed my documents in as examples. However, I realized that some examples trigger an OpenAI content filtering error, so I want to remove them from the vector store, but I couldn't figure out how to do it.
I tried recreating my example documents and the example selector from scratch, but I would love to learn whether there is a way to remove specific embeddings from the vector store.
Upvotes: 4
Views: 15758
Reputation: 31
If you need to delete all the documents in the vector store, you can also drop the whole collection (note that this goes through private attributes):
vectorstore._client.delete_collection(vectorstore._collection.name)
Upvotes: 0
Reputation: 364
Since you appear to be using ChromaDB, you can use the delete method it provides. Most vector store integrations in LangChain should have a similar delete method.
Below is an example from LangChain's official docs (https://python.langchain.com/docs/integrations/vectorstores/chroma#update-and-delete):
# create simple ids
ids = [str(i) for i in range(1, len(docs) + 1)]
# add data
example_db = Chroma.from_documents(docs, embedding_function, ids=ids)
docs = example_db.similarity_search(query)
print(docs[0].metadata)
# delete the last document
print("count before", example_db._collection.count())
example_db._collection.delete(ids=[ids[-1]])
print("count after", example_db._collection.count())
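To remove only the examples that trigger the content filter, one option is to give every example a deterministic id when the store is built, so the offending ones can be looked up and deleted later. The following is a minimal sketch rather than LangChain's own API: example_id, delete_flagged, and InMemoryStore are hypothetical names, and the helper only assumes the store has a delete(ids=...) method and that the same ids were supplied at creation time (as with ids=ids in the snippet above).

```python
import hashlib
import json
from typing import Any, Callable, Dict, Iterable, List


def example_id(example: Dict[str, str]) -> str:
    """Derive a stable id from an example's content (hypothetical helper)."""
    payload = json.dumps(example, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:16]


def delete_flagged(
    store: Any,
    examples: List[Dict[str, str]],
    is_flagged: Callable[[Dict[str, str]], bool],
) -> List[str]:
    """Delete every flagged example from the store and return the removed ids.

    Assumes `store` exposes delete(ids=...) and was populated with
    ids produced by example_id().
    """
    bad_ids = [example_id(ex) for ex in examples if is_flagged(ex)]
    if bad_ids:
        store.delete(ids=bad_ids)
    return bad_ids


class InMemoryStore:
    """Stand-in for a real vector store, used only to exercise the helper."""

    def __init__(self, ids: Iterable[str]) -> None:
        self.ids = set(ids)

    def delete(self, ids: List[str]) -> None:
        self.ids -= set(ids)


# Made-up examples; the second one plays the role of a filtered example.
examples = [
    {"input": "happy", "output": "sad"},
    {"input": "BLOCKED", "output": "x"},
]
store = InMemoryStore(example_id(ex) for ex in examples)
removed = delete_flagged(store, examples, lambda ex: ex["input"] == "BLOCKED")
print(len(removed), len(store.ids))
```

With a real store you would pass the Chroma object in place of InMemoryStore and replace the lambda with whatever check identifies the filtered examples.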
Upvotes: 5