Reputation: 172
I have an ES index with events (logs) and I want to find occurrences where one event of type A is followed by one event of type B within the next 5 minutes. I'm quite new to ES, so I'm not sure what the best way to achieve that is. I think aggregations might be a good approach, but I don't see any that fit this need.
Example:
I have the following events
{ id: 1, timestamp: "2019-11-08 10:00", type: "A" },
{ id: 2, timestamp: "2019-11-08 10:01", type: "B" },
{ id: 3, timestamp: "2019-11-08 10:07", type: "A" },
{ id: 4, timestamp: "2019-11-08 10:10", type: "B" },
{ id: 5, timestamp: "2019-11-08 10:20", type: "B" }
I would like to find a way to output "correlated" events such as the following (the output format here is not important and I can adapt it if needed; only the "correlation" information matters):
{ "id" : [1, 2] },
{ "id" : [3, 4] }
because events 1 and 2 occurred within 5 minutes of each other, and so did events 3 and 4. Event 5 is not "correlated" with any other event, so it is not in the results.
Upvotes: 0
Views: 148
Reputation: 217494
For starters, you could leverage the date_histogram aggregation.
First, index some documents:
POST test/_doc/_bulk
{"index":{"_id": "1"}}
{ "id": 1, "timestamp": "2019-11-08T10:00:00", "type": "A" }
{"index":{"_id": "2"}}
{ "id": 2, "timestamp": "2019-11-08T10:01:00", "type": "B" }
{"index":{"_id": "3"}}
{ "id": 3, "timestamp": "2019-11-08T10:07:00", "type": "A" }
{"index":{"_id": "4"}}
{ "id": 4, "timestamp": "2019-11-08T10:09:00", "type": "B" }
{"index":{"_id": "5"}}
{ "id": 5, "timestamp": "2019-11-08T10:20:00", "type": "B" }
Then run a query that aggregates the documents into 5-minute intervals:
POST test/_search
{
  "size": 0,
  "aggs": {
    "history": {
      "date_histogram": {
        "field": "timestamp",
        "interval": "5m",
        "min_doc_count": 1
      },
      "aggs": {
        "hits": {
          "top_hits": {
            "_source": false
          }
        }
      }
    }
  }
}
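For this sample data you'll see the results you expect: the first bucket (starting at 10:00) contains documents 1 and 2, the second bucket (starting at 10:05) contains documents 3 and 4, and document 5 ends up alone in a third bucket (starting at 10:20). Because "_source" is false, each hit lists only the document metadata, such as its _id.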
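One caveat: date_histogram buckets are aligned to fixed 5-minute boundaries, so an A at 10:04 followed by a B at 10:06 would land in different buckets even though they are only 2 minutes apart. If that matters, one option is to do the pairing client-side once the events have been fetched. The following is only a rough sketch of that idea in Python, not an Elasticsearch feature: the correlate function and its parameters are hypothetical names, and it assumes the events are already loaded as dicts with parsed timestamps and that each A is paired with at most one B.

```python
from datetime import datetime, timedelta


def correlate(events, first_type="A", second_type="B",
              window=timedelta(minutes=5)):
    """Pair each first_type event with the next second_type event
    that occurs within `window` after it (hypothetical helper)."""
    ordered = sorted(events, key=lambda e: e["timestamp"])
    pairs = []
    pending = []  # first_type events still waiting for a match
    for event in ordered:
        ts = event["timestamp"]
        # Forget waiting events whose window has already expired.
        pending = [a for a in pending if ts - a["timestamp"] <= window]
        if event["type"] == first_type:
            pending.append(event)
        elif event["type"] == second_type and pending:
            # Match the oldest waiting event first.
            a = pending.pop(0)
            pairs.append([a["id"], event["id"]])
    return pairs


def ev(id_, ts, type_):
    """Build an event dict from an ISO timestamp string."""
    return {"id": id_, "timestamp": datetime.fromisoformat(ts), "type": type_}


# Same sample data as the bulk request above.
events = [
    ev(1, "2019-11-08T10:00:00", "A"),
    ev(2, "2019-11-08T10:01:00", "B"),
    ev(3, "2019-11-08T10:07:00", "A"),
    ev(4, "2019-11-08T10:09:00", "B"),
    ev(5, "2019-11-08T10:20:00", "B"),
]

print(correlate(events))  # [[1, 2], [3, 4]]
```

Unlike the fixed buckets, this measures the window from each A event, so a pair that straddles a bucket boundary is still found. It does require pulling the events out of the index first (for example with a range query sorted by timestamp), which may not be practical for very large result sets.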
Upvotes: 1