Reputation: 422
I am performing Elasticsearch operations in Python. I have the following fields and data in an Elasticsearch index:
sid member_id timestamp
1 1 2020-11-26T13:10:49.899Z
1 1 2020-11-25T14:10:49.899Z
1 2 2020-11-24T15:10:49.899Z
1 2 2020-11-25T16:10:49.899Z
The mapping for these fields is as follows:
"sid": {
"type": "integer",
"fields": {
"raw": {
"type": "keyword"
}
}
},
"member_id": {
"type": "integer",
"fields": {
"raw": {
"type": "keyword"
}
}
},
"timestamp": {
"type": "date",
"format": "yyyyMMdd'T'HHmmss.SSSZ"
}
As a result, I want each distinct member_id with its latest timestamp, like this:
member_id timestamp
1 2020-11-26T13:10:49.899Z
2 2020-11-25T16:10:49.899Z
Is it possible to get this result in Elasticsearch? If so, how can I do it? Thanks in advance.
Upvotes: 1
Views: 187
Reputation: 16172
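You can use a terms aggregation with a max sub-aggregation to get this result. The terms aggregation creates one bucket per distinct member_id, and the max aggregation returns the latest timestamp within each bucket.
Below is a working example with the index mapping, index data, search query, and search result.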
Index Mapping:
{
"mappings": {
"properties": {
"timestamp": {
"type": "date",
"format": "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"
}
}
}
}
Index Data:
{
"sid": 1,
"member_id": 1,
"timestamp": "2020-11-26T13:10:49.899Z"
}
{
"sid": 1,
"member_id": 1,
"timestamp": "2020-11-25T14:10:49.899Z"
}
{
"sid": 1,
"member_id": 2,
"timestamp": "2020-11-24T15:10:49.899Z"
}
{
"sid": 1,
"member_id": 2,
"timestamp": "2020-11-25T16:10:49.899Z"
}
Search Query:
{
"size": 0,
"aggs": {
"unique_id": {
"terms": {
"field": "member_id",
"order": {
"latestOrder": "desc"
}
},
"aggs": {
"latestOrder": {
"max": {
"field": "timestamp"
}
}
}
}
}
}
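Since the question mentions Python, here is a minimal sketch of building the same request body as a plain dict. The function name `build_latest_per_member_query` and its parameters are made up for illustration. The explicit terms `size` is an addition that is not in the query above: a terms aggregation returns only 10 buckets by default, so it needs raising when there are more distinct member_id values.

```python
import json


def build_latest_per_member_query(group_field="member_id",
                                  time_field="timestamp",
                                  max_buckets=10):
    """Build a terms aggregation with a max sub-aggregation.

    Hypothetical helper: the aggregation names "unique_id" and
    "latestOrder" are taken from the query above; max_buckets is an
    assumption added here (the terms aggregation defaults to 10 buckets).
    """
    return {
        "size": 0,  # return no hits, only the aggregation results
        "aggs": {
            "unique_id": {
                "terms": {
                    "field": group_field,
                    "size": max_buckets,
                    # sort buckets by their latest timestamp, newest first
                    "order": {"latestOrder": "desc"},
                },
                "aggs": {
                    "latestOrder": {"max": {"field": time_field}},
                },
            }
        },
    }


body = build_latest_per_member_query()
print(json.dumps(body, indent=2))

# Sending it is left out so the sketch runs without a cluster. With the
# official elasticsearch-py client (7.x style) it would look roughly like:
#   es.search(index="my-index", body=body)   # index name is hypothetical
```

The dict mirrors the JSON query one-to-one, so it can be compared against the query above before being sent.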
Search Result:
"aggregations": {
"unique_id": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": 1,
"doc_count": 2,
"latestOrder": {
"value": 1.606396249899E12,
"value_as_string": "2020-11-26T13:10:49.899Z"
}
},
{
"key": 2,
"doc_count": 2,
"latestOrder": {
"value": 1.606320649899E12,
"value_as_string": "2020-11-25T16:10:49.899Z"
}
}
]
}
}
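To turn this response into the member_id / timestamp rows the question asks for, you only need to walk the buckets. This is a rough sketch: `latest_per_member` is a hypothetical helper name, and `sample_response` is the search result above copied into a Python dict rather than the output of a live query.

```python
def latest_per_member(response):
    """Return (member_id, latest_timestamp) pairs from the aggregation result.

    Assumes the aggregation names used in the query above:
    "unique_id" for the terms aggregation, "latestOrder" for the max.
    """
    buckets = response["aggregations"]["unique_id"]["buckets"]
    return [(b["key"], b["latestOrder"]["value_as_string"]) for b in buckets]


# The search result shown above, as a Python dict.
sample_response = {
    "aggregations": {
        "unique_id": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {
                    "key": 1,
                    "doc_count": 2,
                    "latestOrder": {
                        "value": 1.606396249899e12,
                        "value_as_string": "2020-11-26T13:10:49.899Z",
                    },
                },
                {
                    "key": 2,
                    "doc_count": 2,
                    "latestOrder": {
                        "value": 1.606320649899e12,
                        "value_as_string": "2020-11-25T16:10:49.899Z",
                    },
                },
            ],
        }
    }
}

rows = latest_per_member(sample_response)
for member_id, timestamp in rows:
    print(member_id, timestamp)
```

`value` is the timestamp in epoch milliseconds, and `value_as_string` is the same instant formatted with the date format from the mapping, so the string form is usually the one to display.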
Upvotes: 1