Reputation: 41
How to count number of objects in a nested filed in elastic search? Sample mapping :
"base_keywords": {
"type": "nested",
"properties": {
"base_key": {
"type": "text"
},
"category": {
"type": "text"
},
"created_at": {
"type": "date"
},
"date": {
"type": "date"
},
"rank": {
"type": "integer"
}
}
}
I would like to count number of objects in nested filed 'base_keywords'.
Upvotes: 4
Views: 6906
Reputation: 11
I was trying to do similar to understand production data distribution The following query helped me find top 5
{
"query": {
"match_all": {}
},
"aggs": {
"n_base_keywords": {
"nested": { "path": "base_keywords" },
"aggs": {
"top_count": { "terms": { "field": "_id", "size" : 5 } }
}
}
}
}
Upvotes: 1
Reputation: 12186
You would need to do this with inline script. This is what worked for me: (Using ES 6.x):
GET your-indices/_search
{
"aggs": {
"whatever": {
"sum": {
"script": {
"inline": "params._source.base_keywords.size()"
}
}
}
}
}
Upvotes: 2
Reputation: 116
Aggs are normally good for counting and grouping, for nested documents you can use nested aggs:
"aggs": {
"MyAggregation1": {
"terms": {
"field": "FieldA",
"size": 0
},
"aggs": {
"BaseKeyWords": {
"nested": { "path": "base_keywords" },
"aggs": {
"BaseKeys": {
"terms": {
"field": "base_keywords.base_key.keyword",
"size": 0
}
}
}
}
}
}
}
You don't specify what you want to count, but aggs are quite flexible for grouping and counting data. The "doc_count" and "key" behave similar to an sql group by + count()
Updated (This assumes you have a .keyword field create the "keys" values, since a property of type "text" can't be aggregated or counted:
{
"aggs": {
"MyKeywords1Agg": {
"nested": { "path": "keywords1" },
"aggs": {
"NestedKeywords": {
"terms": {
"field": "keywords1.keys.keyword",
"size": 0
}
}
}
}
}
}
For simply counting the number of nested keys you could simply do this:
{
"aggs": {
"MyKeywords1Agg": {
"nested": { "path": "keywords1" }
}
}
}
If you want to get some grouping on the field values on the "main" document or the nested documents, you will have to extend your mapping / data model to include terms that are aggregatable, which includes most data types in elasticsearch except "text", ex.: dates, numbers, geolocations, keywords.
Edit: Example with aggregating on a unique identifier for each top level document, assuming you have a property on it called "WordMappingId" of type integer
{
"aggs": {
"word_maping_agg": {
"terms": {
"field": "WordMappingId",
"size": 0,
"missing": -1
},
"aggs": {
"Keywords1Agg": null,
"nested": { "path": "keywords1" }
}
}
}
}
If you don't add any properties to the "word_maping" document on the top level there is no way to do an aggregation for each unique document. The builtin _id field is by default not aggregateable, and I suggest you include a unique identifier from the source data on the top level to aggregate on.
Note: the "missing" parameter will put all documents that don't have the WordMappingId property set in a bucked with the supplied value, this makes sure you're not missing any documents in the search results.
Aggs can support a behaviour similar to a group by in SQL, but you need something to actually group it by, and according to the mapping you supplied there are no such fields currently in your index.
Upvotes: 1