Reputation: 5191
I created a mapping to index my MongoDB collection using Elasticsearch. Here are the mapping properties:
"properties" : {
"address_components" : {
"properties" : {
"_id" : {
"type" : "string"
},
"subLocality1" : {
"type" : "string",
"index" : "not_analyzed"
},
"subLocality2" : {
"type" : "string",
"index" : "not_analyzed"
},
"subLocality3" : {
"type" : "string",
"index" : "not_analyzed"
},
"city" : {
"type" : "string",
"index" : "not_analyzed"
}
}
}
}
Now I want to retrieve the overall unique values from these fields: subLocality1, subLocality2, subLocality3, and city.
Each distinct value should also contain q as a substring, and each distinct item should carry its corresponding city value.
Example:
"address_components" : {
"subLocality1" : "s1",
"subLocality2" : "s1",
"subLocality3" : "s2",
"city":"a"
}
"address_components" : {
"subLocality1" : "s3",
"subLocality2" : "s1",
"subLocality3" : "s2",
"city":"a"
}
"address_components" : {
"subLocality1" : "s2",
"subLocality2" : "s1",
"subLocality3" : "s4",
"city":"a"
}
For the documents above, the expected result is:
"address_components" : {
"subLocality1" : "s1",
"subLocality2" : "s1",
"subLocality3" : "s2",
"city":"ct1"
}
"address_components" : {
"subLocality1" : "s3",
"subLocality2" : "s1",
"subLocality3" : "s2",
"city":"ct1"
}
"address_components" : {
"subLocality1" : "s2",
"subLocality2" : "s1",
"subLocality3" : "s4",
"city":"ct1"
}
{s1, a}, {s2,a}, {s3,a}, {s4,a},{a,a}
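To make the requirement concrete, here is a small plain-Python sketch of the result I am after. It does not talk to Elasticsearch at all; it only emulates the desired output on the three sample documents whose city is "a". The names docs, FIELDS and distinct_with_city are made up for illustration.

```python
# Sample documents copied from the example above (city is "a" in each).
docs = [
    {"subLocality1": "s1", "subLocality2": "s1", "subLocality3": "s2", "city": "a"},
    {"subLocality1": "s3", "subLocality2": "s1", "subLocality3": "s2", "city": "a"},
    {"subLocality1": "s2", "subLocality2": "s1", "subLocality3": "s4", "city": "a"},
]

# The four fields whose values should be pooled together.
FIELDS = ("subLocality1", "subLocality2", "subLocality3", "city")


def distinct_with_city(documents, q=""):
    """Collect unique (value, city) pairs whose value contains q as a substring."""
    pairs = set()
    for doc in documents:
        for field in FIELDS:
            value = doc[field]
            if q in value:  # an empty q matches every value
                pairs.add((value, doc["city"]))
    return sorted(pairs)


print(distinct_with_city(docs))
print(distinct_with_city(docs, q="s"))
```

With an empty q this yields the five pairs listed above; passing q="s" drops the pair that comes from the city field itself.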
I tried doing it using an Elasticsearch terms aggregation.
GET /rescu/rescu/_search?pretty=true&search_type=count
{
"aggs" : {
"distinct_locations" : {
"terms" : {
"script" : "doc['address_components.subLocality1'].value"
}
}
}
}
But a terms aggregation only applies to a single field, according to the following link.
Upvotes: 6
Views: 15786
Reputation: 5191
I found the answer myself after going through the Elasticsearch API docs. We need to use a script that returns the values of several fields, so that the terms aggregation collects terms from all of them.
GET /rescu/rescu/_search?pretty=true&search_type=count
{
"aggs": {
"distinct_locations": {
"terms": {
"script": "[doc['address_components.subLocality1'].value,doc['address_components.subLocality2'].value,doc['address_components.subLocality3'].value]",
"size": 5000
}
}
}
}
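If the list of fields changes often, the script string can be generated instead of typed by hand. The following is only a sketch in Python: multi_field_terms_body is a hypothetical helper name, it only builds the request body as a dict, and actually sending it to the cluster is left out.

```python
import json


def multi_field_terms_body(fields, agg_name="distinct_locations", size=5000):
    """Build a terms aggregation whose script returns one value per field.

    Hypothetical helper: it only assembles the request body shown above.
    """
    values = ",".join("doc['{}'].value".format(field) for field in fields)
    return {
        "aggs": {
            agg_name: {
                "terms": {"script": "[" + values + "]", "size": size}
            }
        }
    }


body = multi_field_terms_body([
    "address_components.subLocality1",
    "address_components.subLocality2",
    "address_components.subLocality3",
])
print(json.dumps(body, indent=2))
```

Adding "address_components.city" to the list would pool the city values into the same set of terms.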
Upvotes: 7
Reputation: 1728
I came here from Google while searching for how to do this in a Kibana visualization. It looks like Ritesh's answer is very helpful there as well.
I wanted to do a Unique Count aggregation on two fields: IPAddress and Message.
In Kibana visualizations, the JSON Input field lets you modify the aggregation part of the query sent to Elasticsearch.
However, you have to extract the relevant piece from Ritesh's answer: only the script part is needed.
In my case:
{
"script": "[doc['extra.IPAddress'].value,doc['extra.Message'].value]"
}
What is really missing from the documentation is that the script parameter takes precedence over the field parameter. This is what happens in Kibana: the field parameter is sent from the interface, and the script parameter is sent because you added it in the JSON Input textbox.
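A rough way to picture the resulting request, sketched in Python: the interface contributes the field parameter, the JSON Input contributes the script parameter, and both end up in the same aggregation body. The exact merge that Kibana performs internally is an assumption here, and ui_params and json_input are hypothetical names.

```python
# Assumption: what the Kibana interface generates for the selected field.
ui_params = {"field": "extra.IPAddress"}

# What was typed into the JSON Input textbox.
json_input = {"script": "[doc['extra.IPAddress'].value,doc['extra.Message'].value]"}

# Assumed merge: keys from the JSON Input are added to the generated ones.
merged = {**ui_params, **json_input}

# Unique Count corresponds to the cardinality aggregation in Elasticsearch.
request_agg = {"cardinality": merged}
print(request_agg)
```

Both field and script are present in the merged body, which is why it matters that the script is the one that gets used.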
Upvotes: 2
Reputation: 175
If you use the query provided by Fuad Efendi:
{
"size": 0,
"aggs": {
"country": {
"terms": {
"field": "country"
},
"aggregations": {
"city": {
"terms": {
"field": "city"
}
}
}
}
}
}
It is important to note that the first aggregation will be scoped to any "query" you add, but the second aggregation on "city" will not and will instead be scoped to the entire database. This might not be what you want.
Personally, I find that the script-based answer provided by ritesh_NITW gives the best result.
Upvotes: 4
Reputation: 137
Here is an example with two fields, country and city. It aggregates by country and uses a sub-aggregation by city:
{
"size": 0,
"aggs": {
"country": {
"terms": {
"field": "country"
},
"aggregations": {
"city": {
"terms": {
"field": "city"
}
}
}
}
}
}
You can use many layers of sub-aggregations.
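For more than two layers, writing the nesting by hand gets tedious. Here is a minimal Python sketch that builds the nested body from a list of field names, outermost first. nested_terms is a hypothetical helper, and the third field name "street" is made up for illustration. It uses the short key "aggs" throughout, which Elasticsearch accepts interchangeably with "aggregations".

```python
import json


def nested_terms(fields):
    """Nest one terms aggregation per field, with the first field outermost."""
    if not fields:
        raise ValueError("at least one field is required")
    aggs = None
    # Build from the innermost level outwards.
    for field in reversed(fields):
        level = {field: {"terms": {"field": field}}}
        if aggs is not None:
            level[field]["aggs"] = aggs
        aggs = level
    return {"size": 0, "aggs": aggs}


# "street" is a made-up third level, not a field from the question.
print(json.dumps(nested_terms(["country", "city", "street"]), indent=2))
```

Calling nested_terms(["country", "city"]) reproduces the two-level query above, apart from the "aggs" spelling of the inner key.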
Upvotes: 5