david shcmidt

Reputation: 31

Get the number of unique terms in a field in Elasticsearch

Here are some sample documents that I have:

doc1

{
"occassion" : "Birthday",
"dessert": "gingerbread"
}

doc2

{
"occassion" : "Wedding",
"dessert": "friand"
}

doc3

{
"occassion":"Bethrothal" ,
"dessert":"gingerbread"
}

When I run a simple terms aggregation on the field "dessert", I get results like the following:

"aggregations": {
  "desserts": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      {
        "key": "gingerbread",
        "doc_count": 2
      },
      {
        "key": "friand",
        "doc_count": 1
      }
    ]
  }
}

The issue is that when there are many documents, counting the buckets by hand to find out how many unique keywords exist in the "dessert" field takes a lot of time. Is there a way to get just the number of unique terms in a given field?

Upvotes: 3

Views: 95

Answers (2)

Vineeth Mohan

Reputation: 19263

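I would suggest the cardinality aggregation. Its count is approximate, so if you need an accurate result, set precision_threshold higher than the number of distinct values you expect.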

GET /cars/transactions/_search
{
    "size" : 0,
    "aggs" : {
        "count_distinct_desserts" : {
            "cardinality" : {
              "field" : "dessert",
              "precision_threshold" : 100 
            }
        }
    }
}
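
If you build the request from application code, the same body could be assembled along these lines. This is only a minimal sketch: the helper name build_cardinality_query and its parameters are made up for illustration, and actually sending the body to your cluster with whatever HTTP or Elasticsearch client you use is left out.

```python
import json


def build_cardinality_query(field, agg_name="count_distinct", precision_threshold=None):
    """Build a search body that asks only for the distinct-value count of `field`.

    Hypothetical helper for illustration; it is not part of any client library.
    """
    cardinality = {"field": field}
    # precision_threshold is optional; when omitted the server default applies.
    if precision_threshold is not None:
        cardinality["precision_threshold"] = precision_threshold
    # "size": 0 suppresses the hits so only the aggregation result comes back.
    return {"size": 0, "aggs": {agg_name: {"cardinality": cardinality}}}


body = build_cardinality_query(
    "dessert", agg_name="count_distinct_desserts", precision_threshold=100
)
print(json.dumps(body, indent=4))
```

The printed JSON has the same structure as the request body above, so it can be posted to the _search endpoint as-is.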

Upvotes: 0

Mihai Ionescu

Reputation: 978

The cardinality aggregation seems to be what you're looking for: https://www.elastic.co/guide/en/elasticsearch/guide/current/cardinality.html

Querying this:

{
    "size" : 0,
    "aggs" : {
        "distinct_desserts" : {
            "cardinality" : {
              "field" : "dessert"
            }
        }
    }
}

Would return something like this:

"aggregations": {
  "distinct_desserts": {
     "value": 2
  }
}
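
To read that number in code, something like the sketch below should work. It assumes the response has already been parsed into a Python dict; the helper name distinct_count and the hard-coded sample_response are illustrative only. The last part recomputes the count locally from the three sample documents in the question as a sanity check.

```python
def distinct_count(response, agg_name):
    """Pull the cardinality value out of a parsed search response (hypothetical helper)."""
    return response["aggregations"][agg_name]["value"]


# Hard-coded stand-in for the parsed response shown above.
sample_response = {"aggregations": {"distinct_desserts": {"value": 2}}}

# The three sample documents from the question (field names kept as posted).
docs = [
    {"occassion": "Birthday", "dessert": "gingerbread"},
    {"occassion": "Wedding", "dessert": "friand"},
    {"occassion": "Bethrothal", "dessert": "gingerbread"},
]

# Exact distinct count computed locally, for comparison with the aggregation.
expected = len({doc["dessert"] for doc in docs})

print(distinct_count(sample_response, "distinct_desserts"), expected)
```

On a data set this small the approximate count and the exact one agree; on very large fields the aggregation value may drift slightly from the true count.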

Upvotes: 2
