Reputation: 20061
Our indexed documents do not have a completely fixed schema; that is, not every field is present in every document. Is there a way to create buckets based on the fields present in a set of documents (i.e. the results of a query), with a count of how many documents contain each field? For example, suppose these made-up documents are the results of a query:
{"name":"Bob","field1":"value","field2":"value2","field3":"value3"}
{"name":"Sue","field2":"value4","field3":"value5"}
{"name":"Ali","field1":"value6","field2":"value7"}
{"name":"Joe","field3":"value8"}
This is the information (not format) I want to extract:
name: 4
field1: 2
field2: 3
field3: 3
Is there a way I can aggregate and count to get those results?
Upvotes: 1
Views: 213
Reputation: 52368
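Yeah, I think you can do it with one filter aggregation per field, each wrapping an exists filter. The doc_count of each aggregation is then the number of documents that contain that field. You do have to list the fields up front, and you can add your query next to "aggs" to restrict the counts to its results: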
GET /some_index/some_type/_search?search_type=count
{
  "aggs": {
    "name_bucket": {
      "filter" : { "exists" : { "field" : "name" } }
    },
    "field1_bucket": {
      "filter" : { "exists" : { "field" : "field1" } }
    },
    "field2_bucket": {
      "filter" : { "exists" : { "field" : "field2" } }
    },
    "field3_bucket": {
      "filter" : { "exists" : { "field" : "field3" } }
    }
  }
}
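If there are many fields, writing one aggregation per field by hand gets tedious, so the request body could be generated instead. Here is a minimal sketch in Python that only builds the JSON and does not talk to a cluster. The helper name build_exists_aggs is made up for illustration, and the "size": 0 entry is an assumption for newer Elasticsearch versions, where search_type=count is no longer available and size 0 plays the same role.

```python
import json


def build_exists_aggs(fields):
    """Build a search body with one filter aggregation per field.

    Each aggregation wraps an exists filter, so its doc_count is the
    number of matching documents that contain the field.
    """
    return {
        # Assumption: on newer versions, size 0 replaces search_type=count
        # and suppresses the hits so only aggregations come back.
        "size": 0,
        "aggs": {
            # Same "<field>_bucket" naming as the hand-written query.
            f"{field}_bucket": {"filter": {"exists": {"field": field}}}
            for field in fields
        },
    }


body = build_exists_aggs(["name", "field1", "field2", "field3"])
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of the _search request shown above.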
The aggregations part of the search response then looks something like this, with one doc_count per field:
"aggregations": {
  "field3_bucket": {
    "doc_count": 3
  },
  "field1_bucket": {
    "doc_count": 2
  },
  "field2_bucket": {
    "doc_count": 3
  },
  "name_bucket": {
    "doc_count": 4
  }
}
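To get from that response to the "field: count" information asked for in the question, the bucket names just need to be mapped back to field names. A rough sketch in Python, assuming the response has already been parsed into a dict and that the aggregations were named with the "_bucket" suffix as above. The function name field_counts is hypothetical, and the sample response is copied from the output shown here rather than fetched from a real cluster.

```python
def field_counts(response, suffix="_bucket"):
    """Map each '<field>_bucket' aggregation to its doc_count."""
    counts = {}
    for agg_name, agg in response.get("aggregations", {}).items():
        # Strip the naming suffix to recover the original field name.
        if agg_name.endswith(suffix):
            field = agg_name[: -len(suffix)]
        else:
            field = agg_name
        counts[field] = agg["doc_count"]
    return counts


# Sample response, copied from the example output (wrapped in an object).
response = {
    "aggregations": {
        "field3_bucket": {"doc_count": 3},
        "field1_bucket": {"doc_count": 2},
        "field2_bucket": {"doc_count": 3},
        "name_bucket": {"doc_count": 4},
    }
}

counts = field_counts(response)
for field, count in counts.items():
    print(f"{field}: {count}")
```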
Upvotes: 1