cjbottaro

Reputation: 856

Elasticsearch: filter top hits aggregation

Say I have an Elasticsearch index with a bunch of users' comments:

{ "name": "chris", "date": "2016-01-01", "msg": "hi, foo"}
{ "name": "chris", "date": "2016-01-05", "msg": "bye, bar"}
{ "name": "aaron", "date": "2016-01-10", "msg": "who's bar"}
{ "name": "aaron", "date": "2016-01-15", "msg": "not foo"}

First, I want to find the latest comment for each user. I can do that with the top_hits aggregation:

{
  "aggs": {
    "name": {
      "terms": { "field": "name" },
      "aggs": {
        "latest_comment": {
          "top_hits": {
            "sort": [ { "date": { "order": "desc" } } ],
            "size": 1
          }
        }
      }
    }
  }
}

Which effectively gives me the following:

{ "name": "chris", "date": "2016-01-05", "msg": "bye, bar"}
{ "name": "aaron", "date": "2016-01-15", "msg": "not foo"}

But how can I filter those results now? To be super clear, I want to filter after the top_hits aggregation has picked the latest hits, not before.

Thank you.

Upvotes: 8

Views: 4882

Answers (1)

hossein shemshadi

Reputation: 303

I had exactly the same question. After a lot of searching, this is what I found:

If you want to filter the top hits results based on a numeric metric, you can use a pipeline aggregation such as bucket selector. It keeps or drops whole buckets of the parent aggregation, which effectively implements a SQL HAVING clause in Elasticsearch. A very helpful answer for this case can be found at: implementing HAVING in elasticsearch
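
As a rough sketch of the bucket selector approach, the Python snippet below builds a request body that keeps only the users whose latest comment is on or after a given date. It assumes that date is mapped as a date field (so a max aggregation on it yields epoch milliseconds) and that name is a keyword field. The aggregation names latest_date and only_recent and both helper functions are made up for illustration.

```python
from datetime import datetime, timezone


def epoch_millis(date_str):
    """Convert 'YYYY-MM-DD' (taken as midnight UTC) to epoch milliseconds."""
    dt = datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)


def latest_comment_query(min_latest_date):
    """Build a search body: latest comment per user, restricted to users
    whose most recent comment is on or after min_latest_date."""
    threshold = epoch_millis(min_latest_date)
    return {
        "size": 0,  # only the aggregation results are needed
        "aggs": {
            "name": {
                "terms": {"field": "name"},
                "aggs": {
                    # Same top_hits aggregation as in the question.
                    "latest_comment": {
                        "top_hits": {
                            "sort": [{"date": {"order": "desc"}}],
                            "size": 1,
                        }
                    },
                    # Numeric metric the selector can read (hypothetical name).
                    "latest_date": {"max": {"field": "date"}},
                    # Drops every user bucket for which the script is false.
                    "only_recent": {
                        "bucket_selector": {
                            "buckets_path": {"latest": "latest_date"},
                            "script": "params.latest >= %dL" % threshold,
                        }
                    },
                },
            }
        },
    }


query = latest_comment_query("2016-01-10")
```

With the sample comments from the question and a cutoff of 2016-01-10, the chris bucket (latest comment 2016-01-05) should be dropped and the aaron bucket (2016-01-15) kept. Note that this filters on the max date metric, not on the fields of the top hit itself.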

But if the value you want to filter on is not numeric, there is no way (at least up to v6.2.4) to do that on the Elasticsearch side.

In that case, as @ismail said, you need to do the filtering client-side in your own software.
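
A minimal sketch of the client-side approach in Python, assuming the response is already parsed into a dict and uses the aggregation names from the question (name and latest_comment). The function name filter_latest_comments and the hand-written sample response are illustrative only, trimmed to the keys the function reads.

```python
def filter_latest_comments(response, predicate):
    """Return the _source of each user's latest comment that satisfies predicate."""
    kept = []
    for bucket in response["aggregations"]["name"]["buckets"]:
        hits = bucket["latest_comment"]["hits"]["hits"]
        # top_hits was requested with size 1, so at most one hit per bucket.
        if hits and predicate(hits[0]["_source"]):
            kept.append(hits[0]["_source"])
    return kept


# Hand-written stand-in for the aggregation part of a search response.
sample_response = {
    "aggregations": {
        "name": {
            "buckets": [
                {
                    "key": "chris",
                    "doc_count": 2,
                    "latest_comment": {"hits": {"hits": [
                        {"_source": {"name": "chris", "date": "2016-01-05", "msg": "bye, bar"}}
                    ]}},
                },
                {
                    "key": "aaron",
                    "doc_count": 2,
                    "latest_comment": {"hits": {"hits": [
                        {"_source": {"name": "aaron", "date": "2016-01-15", "msg": "not foo"}}
                    ]}},
                },
            ]
        }
    }
}

# Keep only the latest comments whose message mentions "foo".
result = filter_latest_comments(sample_response, lambda doc: "foo" in doc["msg"])
print(result)
```

Because the predicate runs after top_hits has already picked each user's latest comment, this matches the "filter after, not before" requirement, at the cost of transferring every bucket to the client.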

Upvotes: 1
