Alan

Reputation: 241

Grouping values in an aggregation

I'm quite new to Elasticsearch.

I have a query that looks like this:

GET animals/_search
{
  "aggregations" : {
    "top_animals" : {
      "terms" : {"field" : "animals", "size" : 10}
    }
  },
  "size" : 0
}

This returns something like:

{
  (...)
  "aggregations": {
    "top_animals": {
      (...)
      "buckets": [
        {
          "key": "dogs",
          "doc_count": 100
        },
        {
          "key": "whales",
          "doc_count": 70
        },
        {
          "key": "dolphins",
          "doc_count": 50
        },
        {
          "key": "cats",
          "doc_count": 10
        }
      ]
    }
  }
}

Now I've been given a list of animals that are equivalent and should be counted together. So "dogs" and "cats" are "pets", and "dolphins" and "whales" are "aquatic_mammals".

I'd like a result like this (note that the results are ordered):

{
  (...)
  "aggregations": {
    "top_animals": {
      (...)
      "buckets": [
        {
          "key": "aquatic_mammals",
          "doc_count": 120
        },
        {
          "key": "pets",
          "doc_count": 110
        }
      ]
    }
  }
}

How should I modify my query?

Thanks!

Upvotes: 2

Views: 51

Answers (1)

Stock Overflaw

Reputation: 3321

If I understand you correctly, the values pets and aquatic_mammals are not part of the stored data?

There's probably a way to do this with a script (which I can't test, so... good luck!), something like:

GET animals/_search
{
  "aggregations" : {
    "top_animals" : {
      "terms" : {
        "field": "animals",
        "script" : {
          "source": """
            if (_value == 'cats' || _value == 'dogs') {
              return 'pets';
            } else if (_value == 'whales' || _value == 'dolphins') {
              return 'aquatic_mammals';
            } else {
              return 'alien';
            }
          """,
          "lang": "painless"
        },
        "size" : 10
      }
    }
  },
  "size" : 0
}

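Here, _value is set because a "field" is targeted: it holds each value of that field, and the script's return value is what gets bucketed. Check the Terms Aggregation documentation for details.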
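Since I can't test the script, one way to sanity-check its output (or a fallback if scripting isn't an option) is to regroup the buckets of the original, unscripted terms aggregation on the client side. This is a different technique from the script, shown here as a minimal Python sketch. The GROUPS mapping and the regroup_buckets helper are hypothetical names of mine, and the bucket list is just the sample response from the question:

```python
from collections import Counter

# Hypothetical mapping from stored term to group name, taken from the question.
GROUPS = {
    "dogs": "pets",
    "cats": "pets",
    "whales": "aquatic_mammals",
    "dolphins": "aquatic_mammals",
}


def regroup_buckets(buckets, groups, fallback="alien"):
    """Merge terms-aggregation buckets into group buckets, largest first."""
    totals = Counter()
    for bucket in buckets:
        # Terms without a known group are collected under the fallback key.
        totals[groups.get(bucket["key"], fallback)] += bucket["doc_count"]
    # Sort by count descending, then by key so ties come out in a stable order.
    ordered = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return [{"key": key, "doc_count": count} for key, count in ordered]


# Sample buckets copied from the response shown in the question.
buckets = [
    {"key": "dogs", "doc_count": 100},
    {"key": "whales", "doc_count": 70},
    {"key": "dolphins", "doc_count": 50},
    {"key": "cats", "doc_count": 10},
]

grouped = regroup_buckets(buckets, GROUPS)
print(grouped)
```

Keep in mind that this only sums the buckets the query actually returned, so "size" has to be large enough to include every term you want grouped, and a document that holds two terms of the same group is counted once per term.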

It's quite tedious to write because Painless doesn't seem to have a switch statement, but it should do the trick. Also, a more skilled programmer might know a shorter or better way to write this script: I've never had much use for Painless scripting.
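One possibly shorter variant is to pass the mapping in through the script's "params" and do a map lookup instead of chaining if/else. The sketch below builds such a request body in Python. The helper name build_grouped_terms_request is hypothetical, and the one-line Painless source assumes that containsKey and bracket access work on a params map, which I believe is the case but, like the script above, have not tested:

```python
import json


def build_grouped_terms_request(field, groups, fallback="alien", size=10):
    """Build a terms-aggregation body whose script looks groups up in params."""
    # Assumption (untested): Painless allows containsKey() and [] on params maps.
    source = (
        "params.groups.containsKey(_value) "
        "? params.groups[_value] : params.fallback"
    )
    return {
        "size": 0,
        "aggregations": {
            "top_animals": {
                "terms": {
                    "field": field,
                    "script": {
                        "lang": "painless",
                        "source": source,
                        "params": {"groups": groups, "fallback": fallback},
                    },
                    "size": size,
                }
            }
        },
    }


body = build_grouped_terms_request(
    "animals",
    {
        "dogs": "pets",
        "cats": "pets",
        "whales": "aquatic_mammals",
        "dolphins": "aquatic_mammals",
    },
)

# The printed JSON can be pasted as the body of GET animals/_search.
print(json.dumps(body, indent=2))
```

Adding or renaming a group then only means editing the mapping, not the script itself.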

Hope this helps. And works. ;)

Upvotes: 1
