Damodhar

Reputation: 1317

filter metadata with opensearch_vector_search

Hi, I am trying to run a similarity search on my vector DB. I added some documents to the DB along with metadata, like below:

docs = [
    Document(
        page_content='The Structure of the Sun to the center.',
        metadata={'source': '/tmp/tmp3doww_i3/tmp.pdf', 'page': 1},
    ),
    Document(
        page_content='\uf0b7 Radiation pressure : it is Credit: NASA',
        metadata={'source': '/tmp/tmp3doww_i3/tmp.pdf', 'page': 1},
    ),
    Document(
        page_content='The Structu regions',
        metadata={'source': '/tmp/tmp3doww_i3/tmp.pdf', 'page': 2},
    ),
]

Now I want to apply a pre-filter on page = 1 and then run the similarity search:

filter = {"bool": {"filter": {"term": {"page": 6}}}}
resdocs = opensearch_vector_search.similarity_search(" heterogenous individual growth", filter)
resdocs

My question is how to combine a vector DB similarity search with metadata filters. I would appreciate the community's best answers.

Upvotes: 1

Views: 1238

Answers (1)

Nicola Rosetti

Reputation: 21

I had the same issue. To use metadata filtering, you must address the field inside the metadata dictionary (like metadata.page) and then use the keyword version of that field (like metadata.page.keyword). After this, your filter should work as intended. The filter becomes:

{
  "bool": {
    "filter": {
      "term": {
        "metadata.page.keyword": 6
      }
    }
  }
}
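
For reference, the same filter can be built in Python. This is only a minimal sketch: the helper name build_term_filter is hypothetical, and the commented-out call assumes that the opensearch_vector_search object from the question accepts the filter as a keyword argument, which may depend on the LangChain version in use.

```python
from typing import Any, Dict


def build_term_filter(field: str, value: Any) -> Dict[str, Any]:
    """Return a bool query with a single non-scoring term filter.

    `build_term_filter` is a hypothetical helper, not part of any library.
    """
    return {"bool": {"filter": {"term": {field: value}}}}


# Same structure as the JSON filter above.
page_filter = build_term_filter("metadata.page.keyword", 6)

# Assumed usage with the vector store from the question (not executed here;
# the keyword argument name is an assumption and may vary by version):
# resdocs = opensearch_vector_search.similarity_search(
#     " heterogenous individual growth", k=4, filter=page_filter
# )
```

Passing the filter by keyword rather than as the second positional argument avoids it being interpreted as some other parameter of similarity_search.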

The reason you need to add ".keyword" is that when you add a field to metadata, OpenSearch automatically creates field.keyword. You can check this either by going to OpenSearch Dashboards and looking at the fields your index has, or by sending a GET request to https://opensearch-instance/some-index-name and checking the mappings for metadata there.
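
The mapping check can also be done in code. The sketch below walks a mapping dictionary shaped like the response of a GET request on the index and returns the field path to use in a term filter. The function name resolve_filter_field and the sample mapping are hypothetical; the sample assumes default dynamic mapping, where string values are typically indexed as text with a keyword sub-field while numeric values usually are not, so your own index may look different.

```python
from typing import Any, Dict


def resolve_filter_field(mapping: Dict[str, Any], index: str, path: str) -> str:
    """Return `path + '.keyword'` if the mapping defines that sub-field, else `path`.

    `mapping` is expected to look like the JSON returned by GET /<index>.
    """
    node = mapping[index]["mappings"]
    for part in path.split("."):
        node = node["properties"][part]
    if "keyword" in node.get("fields", {}):
        return path + ".keyword"
    return path


# Hypothetical mapping for illustration only; inspect your real index instead.
sample_mapping = {
    "some-index-name": {
        "mappings": {
            "properties": {
                "metadata": {
                    "properties": {
                        "source": {
                            "type": "text",
                            "fields": {
                                "keyword": {"type": "keyword", "ignore_above": 256}
                            },
                        },
                        "page": {"type": "long"},
                    }
                }
            }
        }
    }
}

print(resolve_filter_field(sample_mapping, "some-index-name", "metadata.source"))
# metadata.source.keyword
print(resolve_filter_field(sample_mapping, "some-index-name", "metadata.page"))
# metadata.page
```

If the field you filter on has no keyword sub-field in your mapping, filtering on the plain path (for example metadata.page) is the thing to try.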

Some more details on the difference between object.name and object.name.keyword: topic on the OpenSearch forum, topic on Stack Overflow

Upvotes: 2
