Bulgur

Reputation: 107

Conditional source filtering of nested objects in elasticsearch responses

I have an index of documents with the following (simplified) structure.

{
  "product_id": "abc123",
  "properties": [
    {
      "key": "width",
      "value": 1000
    },
    {
      "key": "height",
      "value": 2000
    },
    {
      "key": "depth",
      "value": 500
    }
  ]
}

Each document can have hundreds of properties.

I want to search for documents matching a query and also specify which properties each returned document should be populated with. In other words, I want to write the following request:

Get me all documents that match query x, and populate each document with the properties ["height", "width", "foobar"].

The array with the properties I want to return is created at query time based on input from the user. The document in the response to the query would look like this:

{
  "product_id": "abc123",
  "properties": [
    {
      "key": "width",
      "value": 1000
    },
    {
      "key": "height",
      "value": 2000
    }
    // No depth!
  ]
}

I have tried to achieve this through source filtering, to no avail. I suspect script fields might be the only way to solve this, but I would rather use a standard mechanism. Does anyone have any ideas?

Upvotes: 1

Views: 1599

Answers (1)

ollik1

Reputation: 4540

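The best approach I can think of is to use inner_hits on a nested query. This requires the property array to be mapped as a nested field (named props in this example). For example: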

PUT proptest
{
  "mappings": {
    "default": {
      "properties": {
        "product_id": {
          "type": "keyword"
        },
        "color": {
          "type": "keyword"
        },
        "props": {
          "type": "nested"
        }
      }
    }
  }
}

PUT proptest/default/1
{
  "product_id": "abc123",
  "color": "red",
  "props": [
    {
      "key": "width",
      "value": 1000
    },
    {
      "key": "height",
      "value": 2000
    },
    {
      "key": "depth",
      "value": 500
    }
  ]
}
PUT proptest/default/2
{
  "product_id": "def",
  "color": "red",
  "props": [
  ]
}
PUT proptest/default/3
{
  "product_id": "ghi",
  "color": "blue",
  "props": [
    {
      "key": "width",
      "value": 1000
    },
    {
      "key": "height",
      "value": 2000
    },
    {
      "key": "depth",
      "value": 500
    }
  ]
}

Now we can query by color and fetch only the height, depth and foobar properties:

GET proptest/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "color": {
              "value": "red"
            }
          }
        },
        {
          "bool": {
            "should": [
              {
                "nested": {
                  "path": "props",
                  "query": {
                    "match": {
                      "props.key": "height depth foobar"
                    }
                  },
                  "inner_hits": {}
                }
              },
              {
                "match_all": {}
              }
            ]
          }
        }
      ]
    }
  },
  "_source": {
    "excludes": "props"
  }
}

The output is:

{
  "hits": {
    "total": 2,
    "max_score": 2.2685113,
    "hits": [
      {
        "_index": "proptest",
        "_type": "default",
        "_id": "1",
        "_score": 2.2685113,
        "_source": {
          "color": "red",
          "product_id": "abc123"
        },
        "inner_hits": {
          "props": {
            "hits": {
              "total": 2,
              "max_score": 0.9808292,
              "hits": [
                {
                  "_index": "proptest",
                  "_type": "default",
                  "_id": "1",
                  "_nested": {
                    "field": "props",
                    "offset": 2
                  },
                  "_score": 0.9808292,
                  "_source": {
                    "key": "depth",
                    "value": 500
                  }
                },
                {
                  "_index": "proptest",
                  "_type": "default",
                  "_id": "1",
                  "_nested": {
                    "field": "props",
                    "offset": 1
                  },
                  "_score": 0.9808292,
                  "_source": {
                    "key": "height",
                    "value": 2000
                  }
                }
              ]
            }
          }
        }
      },
      {
        "_index": "proptest",
        "_type": "default",
        "_id": "2",
        "_score": 1.287682,
        "_source": {
          "color": "red",
          "product_id": "def"
        },
        "inner_hits": {
          "props": {
            "hits": {
              "total": 0,
              "max_score": null,
              "hits": []
            }
          }
        }
      }
    ]
  }
}

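Note that the results contain both products abc123 and def, each with the correct properties filtered. Product abc123 matches the given property list partially (height and depth, but not foobar), while def contains none of the requested properties. The set of returned documents is determined only by the outer query color:red, because the match_all clause in the should block lets a document through even when none of its properties match.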
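
Since the list of properties is assembled at query time, the request body can be generated from it. Below is a minimal Python sketch of that idea, not a definitive implementation. The helper name build_search_body and its parameters are made up for illustration; the body it produces mirrors the query above. One assumption worth flagging: inner_hits returns only the top 3 hits by default, so the sketch sets its size to the number of requested keys, assuming each document has at most one entry per key. For long key lists the index setting index.max_inner_result_window (100 by default, as far as I know) may need to be raised.

```python
import json


def build_search_body(outer_query, keys, nested_path="props"):
    """Hypothetical helper: build a search body that selects documents by
    outer_query and returns only the nested entries whose key is in keys."""
    nested_clause = {
        "nested": {
            "path": nested_path,
            # Same match query as in the answer: the keys are joined with
            # spaces and matched with the default OR operator.
            "query": {"match": {f"{nested_path}.key": " ".join(keys)}},
            # Assumption: one entry per key, so len(keys) hits are enough.
            "inner_hits": {"size": len(keys)},
        }
    }
    return {
        "query": {
            "bool": {
                "must": [
                    outer_query,
                    # match_all keeps documents that have none of the keys.
                    {"bool": {"should": [nested_clause, {"match_all": {}}]}},
                ]
            }
        },
        # Drop the full nested array from the top-level _source.
        "_source": {"excludes": nested_path},
    }


body = build_search_body(
    {"term": {"color": {"value": "red"}}},
    ["height", "depth", "foobar"],
)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of GET proptest/_search. Joining keys with spaces only works while the keys themselves contain no whitespace.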

The drawback of this method is that the properties are not returned under the top-level _source of each hit but under its inner_hits key.
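
If the original document shape is needed, the inner hits can be folded back into the source on the client side. The following Python sketch shows one way to do that; the function name merge_inner_hits is hypothetical, and the sample hit is a trimmed copy of the first hit from the output above. It assumes the inner hits are keyed by the nested path (the default when no name is given to inner_hits).

```python
def merge_inner_hits(hit, nested_path="props"):
    """Hypothetical helper: return a copy of the hit's _source with the
    matched nested entries placed back under nested_path."""
    doc = dict(hit.get("_source", {}))
    inner = (
        hit.get("inner_hits", {})
        .get(nested_path, {})
        .get("hits", {})
        .get("hits", [])
    )
    # Inner hits are ordered by score; sort by offset to restore the
    # order the entries had in the original array.
    inner = sorted(inner, key=lambda h: h["_nested"]["offset"])
    doc[nested_path] = [h["_source"] for h in inner]
    return doc


# Trimmed copy of the first hit in the response shown earlier.
sample_hit = {
    "_id": "1",
    "_source": {"color": "red", "product_id": "abc123"},
    "inner_hits": {
        "props": {
            "hits": {
                "total": 2,
                "hits": [
                    {
                        "_nested": {"field": "props", "offset": 2},
                        "_source": {"key": "depth", "value": 500},
                    },
                    {
                        "_nested": {"field": "props", "offset": 1},
                        "_source": {"key": "height", "value": 2000},
                    },
                ],
            }
        }
    },
}

merged = merge_inner_hits(sample_hit)
print(merged)
```

For a hit without matching properties, such as product def, the same function yields an empty props list.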

Upvotes: 2
