WoJ

Reputation: 30045

How to get the count of a pair of field values?

I need to build a heatmap from data I have in Elasticsearch. Each cell of the heatmap is the number of documents that share the same pair of values for two specific fields. For the data

{'name': 'john', 'age': '10', 'car': 'peugeot'}
{'name': 'john', 'age': '10', 'car': 'audi'}
{'name': 'john', 'age': '12', 'car': 'fiat'}
{'name': 'mary', 'age': '3', 'car': 'mercedes'}

I would like to get the number of documents for each unique pair of name and age values. That would be

john, 10, 2
john, 12, 1
mary, 3, 1

I could fetch all the events and do the counting myself, but I was hoping there would be some magical aggregation that could provide this.

It would not be a problem to have it in a nested form, such as

{
  'john':
    {
      '10': 2,
      '12': 1
    },
  'mary':
    {
      '3': 1
    },
}

or whatever is practical.
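
For reference, the client-side counting mentioned above can be sketched in a few lines of Python, which also pins down the expected result for the sample data. The function name `count_pairs` is hypothetical, and the documents are assumed to have been fetched already as plain dicts.

```python
from collections import Counter

def count_pairs(docs, first="name", second="age"):
    """Count how many documents share each (first, second) value pair."""
    return Counter((doc[first], doc[second]) for doc in docs)

# The sample documents from the question.
docs = [
    {'name': 'john', 'age': '10', 'car': 'peugeot'},
    {'name': 'john', 'age': '10', 'car': 'audi'},
    {'name': 'john', 'age': '12', 'car': 'fiat'},
    {'name': 'mary', 'age': '3', 'car': 'mercedes'},
]

pair_counts = count_pairs(docs)
for (name, age), count in sorted(pair_counts.items()):
    print(f"{name}, {age}, {count}")
```

This only works when every matching document is pulled to the client, which is exactly the cost a server-side aggregation would avoid.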

Upvotes: 0

Views: 1332

Answers (2)

Thomas Decaux

Reputation: 22691

You can use a terms aggregation with a script:

https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#_multi_field_terms_aggregation

This way you can concatenate whatever fields you want, for example:

{
    "aggs" : {
        "data" : {
            "terms" : {
                "script" : {
                    "source": "doc['name'].value + doc['age'].value",
                    "lang": "painless"
                }
            }
        }
    }
}

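(I have not tested this. Painless concatenates strings with +; putting a separator between the two values makes the combined key easier to split again.)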
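
A minimal Python sketch of this approach, building the request body and splitting the combined keys back into pairs. The helper names `pair_terms_agg` and `split_buckets` and the `|` separator are assumptions made for illustration, and both fields are assumed to be mapped so that `doc[...]` access works (for example as keyword fields). The sample buckets are hand-written, not the output of a real query.

```python
import json

SEPARATOR = "|"  # assumption: this character never occurs in a name or an age

def pair_terms_agg(first, second, size=100):
    """Build a search body whose terms aggregation buckets on two concatenated fields."""
    source = f"doc['{first}'].value + '{SEPARATOR}' + doc['{second}'].value"
    return {
        "size": 0,  # return no hits, only the aggregation
        "aggs": {
            "data": {
                "terms": {
                    "script": {"source": source, "lang": "painless"},
                    "size": size,  # maximum number of buckets to return
                }
            }
        },
    }

def split_buckets(buckets):
    """Turn buckets keyed like 'john|10' into (first, second, count) triples."""
    triples = []
    for bucket in buckets:
        first, second = bucket["key"].split(SEPARATOR, 1)
        triples.append((first, second, bucket["doc_count"]))
    return triples

body = pair_terms_agg("name", "age")
print(json.dumps(body, indent=2))

# Hand-written buckets in the shape a terms aggregation returns (illustrative only).
sample_buckets = [
    {"key": "john|10", "doc_count": 2},
    {"key": "john|12", "doc_count": 1},
    {"key": "mary|3", "doc_count": 1},
]
print(split_buckets(sample_buckets))
```

The body can be sent to the `_search` endpoint with any HTTP client; the separator is what keeps keys such as "john1" + "0" and "john" + "10" from colliding.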

Upvotes: 0

Richa

Reputation: 7649

You can use a nested terms aggregation (a sub-aggregation inside another aggregation). Use a query like

POST count-test/_search
{
  "size": 0,
  "aggs": {
    "group By Name": {
      "terms": {
        "field": "name"
      },
      "aggs": {
        "group By age": {
          "terms": {
            "field": "age"
          }
        }
      }
    }
  }
}

The output won't be exactly in the form you mentioned, but it will look like this:

"aggregations": {
  "group By Name": {
     "doc_count_error_upper_bound": 0,
     "sum_other_doc_count": 0,
     "buckets": [
        {
           "key": "john",
           "doc_count": 3,
           "group By age": {
              "doc_count_error_upper_bound": 0,
              "sum_other_doc_count": 0,
              "buckets": [
                 {
                    "key": "10",
                    "doc_count": 2
                 },
                 {
                    "key": "12",
                    "doc_count": 1
                 }
              ]
           }
        },
        {
           "key": "mary",
           "doc_count": 1,
           "group By age": {
              "doc_count_error_upper_bound": 0,
              "sum_other_doc_count": 0,
              "buckets": [
                 {
                    "key": "3",
                    "doc_count": 1
                 }
              ]
           }
        }
      ]
    }
 }
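
If the nested dictionary from the question is what you need, the buckets above can be reshaped on the client. A minimal Python sketch, assuming the response has already been parsed into a dict; the function name `nested_counts` is hypothetical and the aggregation names are the ones used in the query above.

```python
def nested_counts(aggregations, outer="group By Name", inner="group By age"):
    """Convert nested terms buckets into {outer_key: {inner_key: doc_count}}."""
    result = {}
    for outer_bucket in aggregations[outer]["buckets"]:
        result[outer_bucket["key"]] = {
            inner_bucket["key"]: inner_bucket["doc_count"]
            for inner_bucket in outer_bucket[inner]["buckets"]
        }
    return result

# The "aggregations" part of the response shown above, trimmed to the fields used here.
aggregations = {
    "group By Name": {
        "buckets": [
            {
                "key": "john",
                "doc_count": 3,
                "group By age": {
                    "buckets": [
                        {"key": "10", "doc_count": 2},
                        {"key": "12", "doc_count": 1},
                    ]
                },
            },
            {
                "key": "mary",
                "doc_count": 1,
                "group By age": {"buckets": [{"key": "3", "doc_count": 1}]},
            },
        ]
    }
}

counts = nested_counts(aggregations)
print(counts)  # {'john': {'10': 2, '12': 1}, 'mary': {'3': 1}}
```

Keep in mind that a terms aggregation returns only the top 10 buckets by default, so with many distinct names or ages you may need to set "size" on each terms level.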

Hope this helps!!

Upvotes: 2
