Dibish

Reputation: 9293

Elasticsearch getting count of distinct rows

I need to find the number of distinct values of a field using Elasticsearch.

My data looks like this:

{
            "_index": "outboxprov1",
            "_type": "user",
            "_id": "16bcd4dc080f4c789018dd97f76741ef",
            "_score": 1,
            "_source": {
               "first_name": "jinu",
               "team_id": "500"
            }
         },
         {
            "_index": "outboxprov1",
            "_type": "user",
            "_id": "9ed8afe738aa63c28b66994cef1f83c6",
            "_score": 1,
            "_source": {
               "first_name": "lal",
               "team_id": "500"
            }
         },
         {
            "_index": "outboxprov1",
            "_type": "user",
            "_id": "1d238cd2f8c06790fc20859a16e3183b",
            "_score": 1,
            "_source": {
               "first_name": "author1",
               "team_id": "500"
            }
         },
         {
            "_index": "outboxprov1",
            "_type": "user",
            "_id": "616ee1c00a02564f71bb6c3067054d55",
            "_score": 1,
            "_source": {
               "first_name": "kannan",
               "team_id": "400"
            }
         },
         {
            "_index": "outboxprov1",
            "_type": "user",
            "_id": "d48132bfaed792f3c32d12e310d41c87",
            "_score": 1,
            "_source": {
               "first_name": "author3",
               "team_id": "400"
            }
         },
         {
            "_index": "outboxprov1",
            "_type": "user",
            "_id": "1a9d05586a8dc3f29b4c8147997391f9",
            "_score": 1,
            "_source": {
               "first_name": "dibish",
               "team_id": "100"
            }
         }

      ]
   } 

There are three distinct team_id values here: 500, 400 and 100, so I want the count to be 3. I have tried the cardinality aggregation:

{
  "size": 0,
  "query": {
    "match_all": {}
  },
  "aggs": {
    "team_id_count": {
      "cardinality": {
        "field": "team_id"
      }
    }
  }
}
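For reference, the cardinality aggregation reports its result under `aggregations.<name>.value` in the search response, and that value is an approximate count. Below is a minimal sketch of reading it in Python. The `sample_response` dict is a hypothetical response body written out by hand for the six documents above, not captured output, and `distinct_count_from_cardinality` is an illustrative helper name.

```python
def distinct_count_from_cardinality(response, agg_name="team_id_count"):
    """Return the (approximate) distinct count reported by a cardinality aggregation."""
    return response["aggregations"][agg_name]["value"]


# Hypothetical response body for the query above (hand-written, not real output).
sample_response = {
    "hits": {"total": 6, "hits": []},
    "aggregations": {"team_id_count": {"value": 3}},
}

distinct_teams = distinct_count_from_cardinality(sample_response)
print(distinct_teams)
```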

This query returns the correct result, but the Elasticsearch documentation states that cardinality is an experimental feature and may be subject to change in the future.

Is there a way to achieve this without the cardinality aggregation? Are there any problems with using this experimental feature? Please point me in the right direction.

Upvotes: 0

Views: 1333

Answers (1)

Mustafa

Reputation: 407

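You could use a terms aggregation instead. It returns one bucket per distinct team_id value, so the number of buckets is the distinct count.

Like this: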

curl -XPOST http://localhost:9200/outboxprov1/user/_search -d '
{
  "size": 0,
  "query": {
    "match_all": {}
  },
  "aggs": {
    "team_id_count": {
      "terms": {
        "field": "team_id"
      }
    }
  }
}'
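To turn the terms aggregation response into a single number, count the entries in `aggregations.team_id_count.buckets`. Here is a minimal sketch in Python. The `sample_response` dict is a hypothetical, hand-written response for the six documents in the question, and `distinct_count_from_terms` is an illustrative helper name. Note that a terms aggregation only returns the top `size` buckets (10 by default), so this count is only reliable when the field has no more distinct values than that limit.

```python
def distinct_count_from_terms(response, agg_name="team_id_count"):
    """Count the buckets of a terms aggregation; each bucket is one distinct value."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return len(buckets)


# Hypothetical response body for the query above (hand-written, not real output).
sample_response = {
    "aggregations": {
        "team_id_count": {
            "buckets": [
                {"key": "500", "doc_count": 3},
                {"key": "400", "doc_count": 2},
                {"key": "100", "doc_count": 1},
            ]
        }
    }
}

distinct_teams = distinct_count_from_terms(sample_response)
print(distinct_teams)
```

If the field may have more distinct values than the default limit, the `size` parameter of the terms aggregation would need to be raised accordingly.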

Upvotes: 2
