Ali Hashemi

Reputation: 3408

Elasticsearch Aggregation doesn't work the way it should

I have an index of messages in which each document also stores a messageHash, along with many other fields. Several documents share the same message text, e.g. "Hello". I want to retrieve only unique messages.

Here is the query I wrote to search for unique messages and sort them by date.

{
  "query": {
    "bool": {
      "must": {
        "term": {
          "message": "Hello"
        }
      },
      "must_not": [
        {
          "term": {
            "user1": "guest"
          }
        },
        {
          "term": {
            "user2": "guest"
          }
        }
      ]
    }
  },
  "aggs": {
    "top_messages": {
      "terms": {
        "field": "messageHash"
      },
      "aggs": {
        "top_messages_hits": {
          "top_hits": {
            "sort": [
              {
                "date": {
                  "order": "desc"
                }
              },
              "_score"
            ],
            "size": 1
          }
        }
      }
    }
  }
}

I still get duplicate messages, and they are not sorted by date either! It's as if I hadn't added the aggregation at all. I can't figure out what's wrong with it.

Upvotes: 1

Views: 71

Answers (1)

Ali Hashemi

Reputation: 3408

In case someone else has the same problem: remember that the response looks something like this:

{
  "took" : X,
  "timed_out" : false,
  "_shards" : {
    "total" : X,
    "successful" : X,
    "failed" : X
  },
  "hits" : {
    "total" : X,
    "max_score" : X,
    "hits" : [
      ...
    ]
  },
  "aggregations" : {
    "top_messages" : {
      "doc_count_error_upper_bound" : X,
      "sum_other_doc_count" : X,
      "buckets" : [
        {
          "key" : "XXXXXXXXXXXXXXXXXXXXXXXX",
          "doc_count" : X,
          "top_messages_hits" : {
            "hits" : {
              "total" : X,
              "max_score" : null,
              "hits" : [
                ...
              ]
            }
          }
        },
        ...
      ]
    }
  }
}

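The first segment, the top-level "hits", contains the plain query results and is not affected by the aggregation, which is why it still shows duplicates. The unique messages are further down, in the "aggregations" segment: each bucket under "top_messages" corresponds to one messageHash, and its "top_messages_hits" holds the most recent document for that hash.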
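As a rough sketch of reading the aggregation segment in client code, the helper below walks the buckets and takes the single top hit from each one. It assumes the response has already been parsed into a Python dict and uses the aggregation names from the question (top_messages, top_messages_hits); the function name unique_messages and the sample data are made up for illustration.

```python
def unique_messages(response):
    """Return one document per messageHash bucket (the bucket's top hit)."""
    aggregations = response.get("aggregations", {})
    buckets = aggregations.get("top_messages", {}).get("buckets", [])
    results = []
    for bucket in buckets:
        # top_hits was requested with "size": 1, so at most one hit per bucket
        hits = bucket["top_messages_hits"]["hits"]["hits"]
        if hits:
            results.append(hits[0])
    return results


# Hypothetical, heavily trimmed response used only to exercise the helper
sample_response = {
    "hits": {"hits": [{"_id": "1"}, {"_id": "2"}, {"_id": "3"}]},
    "aggregations": {
        "top_messages": {
            "buckets": [
                {
                    "key": "hash-a",
                    "doc_count": 2,
                    "top_messages_hits": {
                        "hits": {"hits": [{"_id": "2", "_source": {"message": "Hello"}}]}
                    },
                },
                {
                    "key": "hash-b",
                    "doc_count": 1,
                    "top_messages_hits": {
                        "hits": {"hits": [{"_id": "3", "_source": {"message": "Hello"}}]}
                    },
                },
            ]
        }
    },
}

for doc in unique_messages(sample_response):
    print(doc["_id"], doc["_source"]["message"])
```

Note that a terms aggregation orders its buckets by doc_count by default, so the sort inside top_hits only decides which document represents each bucket. If the list needs to be in date order overall, that would have to be handled separately.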

If you don't want the first segment, set "size": 0 at the top level of the search request, as documented here: https://www.elastic.co/guide/en/elasticsearch/reference/current/returning-only-agg-results.html
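A minimal sketch of that change, assuming the request body is built as a Python dict before being sent. The helper name aggs_only is hypothetical, and the query is shortened from the one in the question. The top-level "size" is separate from the "size": 1 inside top_hits, which still controls how many documents each bucket returns.

```python
import copy
import json


def aggs_only(body):
    """Return a copy of a search body whose top-level "size" is 0."""
    trimmed = copy.deepcopy(body)  # leave the caller's dict untouched
    trimmed["size"] = 0            # suppress the top-level "hits" list
    return trimmed


# Shortened version of the query from the question
body = {
    "query": {"term": {"message": "Hello"}},
    "aggs": {
        "top_messages": {
            "terms": {"field": "messageHash"},
            "aggs": {
                "top_messages_hits": {
                    "top_hits": {
                        "sort": [{"date": {"order": "desc"}}, "_score"],
                        "size": 1,
                    }
                }
            },
        }
    },
}

print(json.dumps(aggs_only(body), indent=2))
```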

Upvotes: 1
