Reputation: 1912
I have an ES index with documents in the formats below; these documents are parsed from logs with the Fluentd parser and indexed into ES. Record formats:
{"id": "id1", "field1": "f1_val", "message": "XXXX", "time": "XXXX"}
{"id": "id1", "field2": "f2_val", "message": "XXXX", "time": "XXXX"}
{"id": "id1", "field3": "f3_val", "field4": "f4_val", "message": "XXXX", "time": "XXXX"}
I want to group by the id field and merge the fields together so that I can visualize the data as a table in a Kibana dashboard, like this:
{"id": "id1", "field1": "f1_val", "field2": "f2_val", "field3": "f3_val", "field4": "f4_val"}
in Kibana:
Id   Field1  Field2  Field3  Field4
id1  f1_val  f2_val  f3_val  f4_val
How can I group documents by id and select the distinct field values in Elasticsearch? Thanks!
Upvotes: 0
Views: 1201
Reputation: 41
Since Elasticsearch doesn't support joins well (and, by extension, neither does Kibana), I would recommend joining the documents in your application before putting them in the index. If that is not a possibility, I would go for a transform (the _transform API), as shown below.
Using this I could achieve something like the attached result image in my dashboard.
Steps to reproduce:
PUT log_index
POST log_index/_doc
{"id": "1", "field1": "The"}
POST log_index/_doc
{"id": "1", "field2": "quick"}
POST log_index/_doc
{"id": "1", "field3": "brown", "field4": "fox"}
POST log_index/_doc
{"id": "2", "field1": "jumped"}
POST log_index/_doc
{"id": "2", "field2": "over"}
POST log_index/_doc
{"id": "2", "field3": "the"}
POST log_index/_doc
{"id": "2", "field4": "lazy"}
PUT _transform/join_logs
{
  "source": {
    "index": ["log_index"]
  },
  "pivot": {
    "group_by": {
      "id.keyword": {
        "terms": { "field": "id.keyword" }
      }
    },
    "aggregations": {
      "field1": {
        "scripted_metric": {
          "init_script": "state.docs = []",
          "map_script": "state.docs.add(new HashMap(params['_source']))",
          "combine_script": "for (t in state.docs) { if (t.get('field1') != null) { return t.get('field1') }} return null",
          "reduce_script": "for (s in states) { if (s != null) { return s }} return null"
        }
      },
      "field2": {
        "scripted_metric": {
          "init_script": "state.docs = []",
          "map_script": "state.docs.add(new HashMap(params['_source']))",
          "combine_script": "for (t in state.docs) { if (t.get('field2') != null) { return t.get('field2') }} return null",
          "reduce_script": "for (s in states) { if (s != null) { return s }} return null"
        }
      },
      "field3": {
        "scripted_metric": {
          "init_script": "state.docs = []",
          "map_script": "state.docs.add(new HashMap(params['_source']))",
          "combine_script": "for (t in state.docs) { if (t.get('field3') != null) { return t.get('field3') }} return null",
          "reduce_script": "for (s in states) { if (s != null) { return s }} return null"
        }
      },
      "field4": {
        "scripted_metric": {
          "init_script": "state.docs = []",
          "map_script": "state.docs.add(new HashMap(params['_source']))",
          "combine_script": "for (t in state.docs) { if (t.get('field4') != null) { return t.get('field4') }} return null",
          "reduce_script": "for (s in states) { if (s != null) { return s }} return null"
        }
      }
    }
  },
  "dest": {
    "index": "joined_index"
  }
}
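Note that PUT _transform only creates the transform; as far as I know it won't populate the destination index until it is started. Something like the following should run it and let you check the joined documents (joined_index is the dest index from the transform above):

POST _transform/join_logs/_start
GET joined_index/_search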
My assumption in doing this was that each field occurs only once among the documents with a given id. I don't know what happens if fields overlap between the documents.
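The application-side join I recommended first can be sketched roughly like this in Python (the record list is a stand-in for whatever your pipeline produces before indexing; field names are taken from the question):

```python
from collections import defaultdict

# Example records as parsed from the logs (per the question).
records = [
    {"id": "id1", "field1": "f1_val", "message": "m1", "time": "t1"},
    {"id": "id1", "field2": "f2_val", "message": "m2", "time": "t2"},
    {"id": "id1", "field3": "f3_val", "field4": "f4_val", "message": "m3", "time": "t3"},
]

# Merge all records sharing an id into one document,
# dropping the per-line message/time fields.
merged = defaultdict(dict)
for record in records:
    merged[record["id"]].update(
        {k: v for k, v in record.items() if k not in ("message", "time")}
    )

print(dict(merged))
# {'id1': {'id': 'id1', 'field1': 'f1_val', 'field2': 'f2_val',
#          'field3': 'f3_val', 'field4': 'f4_val'}}
```

Each merged value can then be indexed as a single document, so no transform is needed. Like the transform, this assumes each field appears at most once per id; if a field repeats, the last record wins.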
Upvotes: 0
Reputation: 3652
{
  "size": 0,
  "aggs": {
    "id_agg": {
      "terms": { "field": "id.keyword" },
      "aggs": {
        "by_field1": { "terms": { "field": "field1.keyword" } },
        "by_field2": { "terms": { "field": "field2.keyword" } },
        "by_field3": { "terms": { "field": "field3.keyword" } },
        "by_field4": { "terms": { "field": "field4.keyword" } }
      }
    }
  }
}
Upvotes: 0