Lebanner

Reputation: 38

Elasticsearch - Unique counts in nested array

To make the question easier to follow, here is how my data is mapped. This is the index template I'm using:

{
    "mappings": 
    {
        "properties": 
        {
            "applicationName": 
            {
                "type": "keyword"
            },
            "tags": 
            {
                "type": "nested",
                "properties": 
                {
                    "tagKey": 
                    {
                        "type": "keyword"
                    },
                    "tagKeyword": 
                    {
                        "type": "keyword"
                    }
                }
            }
        }
    }
}

Here are some sample items:

Sample item 1 
"applicationName": "application1"
"tags": [
         {"tagKey": "user", "tagKeyword": "aaa"},
         {"tagKey": "os", "tagKeyword": "android"}
        ]
Sample item 2 
"applicationName": "application2"
"tags": [
         {"tagKey": "user", "tagKeyword": "bbb"},
         {"tagKey": "os", "tagKeyword": "ios"}
        ]
Sample item 3
"applicationName": "application1"
"tags": [
         {"tagKey": "user", "tagKeyword": "aaa"},
         {"tagKey": "os", "tagKeyword": "pc"}
        ]

For each application, I want to retrieve the number of distinct tagKeyword values among the tags whose tagKey is "user".

For example,

[
  {
    "applicationName": "application1",
    "distinctUser": 2
  },
  {
    "applicationName": "application2",
    "distinctUser": 1
  }
]

Either a solution or a link to the relevant documentation would be appreciated.

Upvotes: 0

Views: 558

Answers (2)

Joe - Check out my books

Reputation: 16943

You can use a terms aggregation on applicationName, then step into the nested tags with a nested aggregation and keep only the user tags with a filter aggregation:

POST index-name/_search?filter_path=aggregations.*.buckets.key,aggregations.*.buckets.nestedTags.distinctUser
{
  "size": 0,
  "aggs": {
    "distinctAppName": {
      "terms": {
        "field": "applicationName",
        "size": 10
      },
      "aggs": {
        "nestedTags": {
          "nested": {
            "path": "tags"
          },
          "aggs": {
            "distinctUser": {
              "filter": {
                "term": {
                  "tags.tagKey": "user"
                }
              }
            }
          }
        }
      }
    }
  }
}

yielding

{
  "aggregations" : {
    "distinctAppName" : {
      "buckets" : [
        {
          "key" : "application1",
          "nestedTags" : {
            "distinctUser" : {
              "doc_count" : 2
            }
          }
        },
        {
          "key" : "application2",
          "nestedTags" : {
            "distinctUser" : {
              "doc_count" : 1
            }
          }
        }
      ]
    }
  }
}
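
One thing to keep in mind: the doc_count under the filter is the number of nested tag objects whose tagKey is "user", so a value that appears in several documents is counted each time. In the three sample items both application1 documents carry the user "aaa", which is why the count is 2 even though there is only one distinct value. If the distinct values are what you need, a cardinality sub-aggregation on tags.tagKeyword under the filter is one option. Below is a minimal Python sketch of that request body, plus a local cross-check on the sample items. The names build_distinct_user_query, local_counts and userTags are made up for illustration, and the body would still have to be sent to index-name/_search with whatever client you use.

```python
from collections import defaultdict

# The three sample items from the question.
SAMPLE_DOCS = [
    {"applicationName": "application1",
     "tags": [{"tagKey": "user", "tagKeyword": "aaa"},
              {"tagKey": "os", "tagKeyword": "android"}]},
    {"applicationName": "application2",
     "tags": [{"tagKey": "user", "tagKeyword": "bbb"},
              {"tagKey": "os", "tagKeyword": "ios"}]},
    {"applicationName": "application1",
     "tags": [{"tagKey": "user", "tagKeyword": "aaa"},
              {"tagKey": "os", "tagKeyword": "pc"}]},
]


def build_distinct_user_query(tag_key="user", max_apps=10):
    """Build a search body counting distinct tagKeyword values per application.

    Hypothetical helper: same shape as the query above, with a cardinality
    sub-aggregation added under the filter.
    """
    return {
        "size": 0,
        "aggs": {
            "distinctAppName": {
                "terms": {"field": "applicationName", "size": max_apps},
                "aggs": {
                    "nestedTags": {
                        "nested": {"path": "tags"},
                        "aggs": {
                            "userTags": {
                                "filter": {"term": {"tags.tagKey": tag_key}},
                                "aggs": {
                                    # cardinality is an approximate count; it can
                                    # drift on very large sets of values
                                    "distinctUser": {
                                        "cardinality": {"field": "tags.tagKeyword"}
                                    }
                                },
                            }
                        },
                    }
                },
            }
        },
    }


def local_counts(docs, tag_key="user"):
    """Return {app: (matching_tag_objects, distinct_values)} computed in Python."""
    values = defaultdict(list)
    for doc in docs:
        for tag in doc["tags"]:
            if tag["tagKey"] == tag_key:
                values[doc["applicationName"]].append(tag["tagKeyword"])
    return {app: (len(v), len(set(v))) for app, v in values.items()}
```

Calling local_counts(SAMPLE_DOCS) gives (2, 1) for application1 and (1, 1) for application2: the first number corresponds to the filter's doc_count, the second to what the cardinality aggregation is meant to report.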

Upvotes: 2

Azar

Reputation: 33

Refer to the nested aggregation documentation. Try a terms aggregation on the applicationName field to group by application, then a terms sub-aggregation on the nested field tags.tagKeyword to get the distinct values within each application.

You also have to add a filter on the "tags.tagKey" field with the value "user" to suit your requirement.
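
A rough Python sketch of what that could look like, assuming the mapping from the question. The function names, the aggregation names (byApp, nestedTags, userTags, users) and the max_users default are made up for illustration. The second helper turns a search response into the shape asked for in the question by counting the term buckets.

```python
def build_user_terms_query(tag_key="user", max_apps=10, max_users=1000):
    """Build a search body listing the distinct tagKeyword values per application."""
    return {
        "size": 0,
        "aggs": {
            "byApp": {
                "terms": {"field": "applicationName", "size": max_apps},
                "aggs": {
                    "nestedTags": {
                        # step into the nested tag objects
                        "nested": {"path": "tags"},
                        "aggs": {
                            "userTags": {
                                # keep only the tags whose tagKey matches
                                "filter": {"term": {"tags.tagKey": tag_key}},
                                "aggs": {
                                    "users": {
                                        "terms": {
                                            "field": "tags.tagKeyword",
                                            "size": max_users,
                                        }
                                    }
                                },
                            }
                        },
                    }
                },
            }
        },
    }


def distinct_users_per_app(response):
    """Count the term buckets per application in a search response.

    The count is only complete while an application has at most max_users
    distinct values, because the terms aggregation returns no more buckets
    than its size.
    """
    result = []
    for app in response["aggregations"]["byApp"]["buckets"]:
        user_buckets = app["nestedTags"]["userTags"]["users"]["buckets"]
        result.append(
            {"applicationName": app["key"], "distinctUser": len(user_buckets)}
        )
    return result
```

Counting buckets on the client is fine for a modest number of users per application; for large sets a cardinality aggregation on tags.tagKeyword avoids shipping every value back.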

Upvotes: 0
