Reputation: 1317
Hi, I am trying to run a similarity search on my vector DB. I added some documents to the DB along with metadata, like below:
docs = [
    Document(page_content='The Structure of the Sun to the center.', metadata={'source': '/tmp/tmp3doww_i3/tmp.pdf', 'page': 1}),
    Document(page_content='\uf0b7 Radiation pressure : it is Credit: NASA', metadata={'source': '/tmp/tmp3doww_i3/tmp.pdf', 'page': 1}),
    Document(page_content='The Structu regions', metadata={'source': '/tmp/tmp3doww_i3/tmp.pdf', 'page': 2}),
]
Now I want to pre-filter on page = 1 and then run the similarity search:
filter = {"bool": {"filter": {"term": {"page": 6}}}}
resdocs = opensearch_vector_search.similarity_search(" heterogenous individual growth",filter)
resdocs
My question is about vector DB similarity search with filters, and I would appreciate the best answers from the community.
Upvotes: 1
Views: 1238
Reputation: 21
I had the same issue. To use metadata filtering, you must address the field inside the metadata dictionary (like metadata.page) and then use the keyword version of that field (like metadata.page.keyword). After this your filter should work as intended. The filter becomes:
{
  "bool": {
    "filter": {
      "term": {
        "metadata.page.keyword": 6
      }
    }
  }
}
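As a minimal sketch, the same filter can be built in Python and handed to the vector store. The helper name `metadata_term_filter` is made up for illustration. The commented-out call assumes `opensearch_vector_search` is the LangChain store from the question. The keyword argument name is also an assumption that depends on your LangChain version, so check the `similarity_search` docstring of your installed release. Note that the second positional argument of `similarity_search` is `k`, so the filter should be passed by keyword rather than positionally.

```python
from typing import Any, Dict


def metadata_term_filter(field: str, value: Any, use_keyword: bool = True) -> Dict[str, Any]:
    """Build a bool/filter/term clause on a field stored under `metadata`.

    Hypothetical helper, not part of LangChain or OpenSearch.
    """
    path = f"metadata.{field}"
    if use_keyword:
        # Target the keyword sub-field, as described in the answer above.
        path += ".keyword"
    return {"bool": {"filter": {"term": {path: value}}}}


page_filter = metadata_term_filter("page", 6)
print(page_filter)

# Assumed usage (not executed here); the keyword name may differ by version:
# resdocs = opensearch_vector_search.similarity_search(
#     "heterogenous individual growth", k=4, boolean_filter=page_filter
# )
```

Setting `use_keyword=False` gives a filter on `metadata.page` directly, which is handy for comparing the two variants against your own index.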
The reason you need to add ".keyword" is that when you add a field to metadata, OpenSearch automatically creates field.keyword. You can check this either by going to Dashboards and checking which fields your index has, or by sending a GET request to https://opensearch-instance/some-index-name and checking the mappings for metadata there.
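To make the mapping check concrete, here is a rough sketch that flattens the `properties` section of an index mapping into dotted field paths, so you can see whether a given field actually has a `.keyword` sub-field. The function name `list_field_paths` and the `sample_response` dictionary are invented for illustration. In practice you would parse the JSON returned by the GET request mentioned above, and your own mapping may look different.

```python
from typing import Any, Dict, List


def list_field_paths(properties: Dict[str, Any], prefix: str = "") -> List[str]:
    """Flatten a mapping's `properties` into dotted field paths, including sub-fields."""
    paths: List[str] = []
    for name, spec in properties.items():
        path = f"{prefix}{name}"
        if "type" in spec:
            paths.append(path)
        # Multi-fields such as `.keyword` are listed under "fields".
        for sub_name in spec.get("fields", {}):
            paths.append(f"{path}.{sub_name}")
        # Object fields (like `metadata`) nest their children under "properties".
        if "properties" in spec:
            paths.extend(list_field_paths(spec["properties"], prefix=f"{path}."))
    return paths


# Illustrative response shape for GET /some-index-name (made-up data, not real output).
sample_response = {
    "some-index-name": {
        "mappings": {
            "properties": {
                "text": {"type": "text"},
                "metadata": {
                    "properties": {
                        "source": {
                            "type": "text",
                            "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
                        }
                    }
                },
            }
        }
    }
}

props = sample_response["some-index-name"]["mappings"]["properties"]
field_paths = list_field_paths(props)
for field_path in field_paths:
    print(field_path)
```

If the field you want to filter on does not show up with a `.keyword` entry in your own mapping, the filter path has to match whatever the mapping actually contains.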
Some more details on the difference between object.name and object.name.keyword: topic on opensearch forum, topic on stackoverflow
Upvotes: 2