Haoyuan Ge

Reputation: 3699

Elasticsearch: return the context of the result (10 lines before and after the hits)

When querying Elasticsearch, only the matching documents are returned. How can I get the context of the hits, say, the 10 documents before and after each hit?

For example, I have inserted 5 logs into Elasticsearch:

{"log": "a"}
{"log": "b"}
{"log": "c"}
{"log": "d"}
{"log": "e"}

I searched with "query": { "match": { "log": "e" } }, and Elasticsearch returns the 5th document. However, I may also want the previous 4 logs for debugging. Can Elasticsearch return that context?

Upvotes: 1

Views: 2934

Answers (1)

hkulekci

Reputation: 1942

My answer may not be a complete one, but I want to share my thoughts on how to solve this problem.

First, you want results like grep's "before" and "after" context options. As far as I know, Elasticsearch finds documents by matching terms against them, and it has no notion of which documents are neighbors in insertion order. In my opinion, you can solve this in two ways. The first is to store the related data in each document at ingest time. The second is to run a second query to find the related data.

With the first method, you will store unnecessary duplicate data, which can cause performance problems or require more storage, CPU, or RAM. In return, you can fetch your data and its related documents with a single query. To do this, you can use the multiline codec of Logstash (https://www.elastic.co/guide/en/logstash/current/plugins-codecs-multiline.html) while ingesting your logs into Elasticsearch.
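As a rough illustration of the first method, here is a minimal Python sketch that copies the preceding log lines into each document before it is indexed. It is not the Logstash multiline codec, just the same idea done by hand. The function name `attach_context` and the field name `context_before` are hypothetical, and the sketch only covers "before" context, since "after" context would need look-ahead or a later update of the document.

```python
from collections import deque


def attach_context(logs, before=4):
    """Return copies of `logs` with the previous `before` log lines attached.

    `context_before` is a hypothetical field name chosen for this sketch.
    """
    window = deque(maxlen=before)  # sliding window of the most recent lines
    enriched = []
    for entry in logs:
        doc = dict(entry)                      # do not mutate the caller's dict
        doc["context_before"] = list(window)   # oldest first
        enriched.append(doc)
        window.append(entry["log"])
    return enriched


if __name__ == "__main__":
    docs = attach_context([{"log": c} for c in "abcde"])
    print(docs[-1])
```

With the five sample logs from the question, the document for "e" carries "a" to "d" in `context_before`, so a single match query on `log` returns the hit together with its preceding lines. The cost is the duplicated storage mentioned above.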

The second method is the cooler one :). Nothing changes on the ingest side, but you have to change the presentation side of your application. Also, as of Elastic Stack 5.4, Kibana has a new feature for this called Document Context. It lets you easily reach the documents surrounding a specific document. I have not tried it yet, but I guess it works the same way as the second method.

Update:

I have checked the Kibana Surrounding Documents feature, and it uses the search_after API.
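To make the second method more concrete, here is a hedged Python sketch of how surrounding documents could be fetched with search_after: one request sorted descending to get the documents before the hit, and one sorted ascending to get the documents after it. It only builds the request bodies, so it runs without a cluster. The field names `@timestamp` and `seq` are assumptions (the sample logs in the question only have a `log` field, so some sortable field plus a tiebreaker would have to exist), and the helper names are hypothetical. This is a sketch of the idea, not Kibana's actual implementation.

```python
def build_context_queries(anchor_sort_values, size=10,
                          time_field="@timestamp", tiebreaker_field="seq"):
    """Build two search bodies around a hit.

    `anchor_sort_values` is the `sort` array returned for the hit when the
    original search used the same two sort fields (assumed field names).
    """
    def body(order):
        return {
            "size": size,
            "query": {"match_all": {}},
            "sort": [
                {time_field: {"order": order}},
                {tiebreaker_field: {"order": order}},
            ],
            # Return only documents that sort strictly after the anchor
            # in the chosen direction.
            "search_after": list(anchor_sort_values),
        }

    # Descending order walks backwards in time, ascending walks forwards.
    return {"before": body("desc"), "after": body("asc")}


def merge_context(before_hits, anchor_hit, after_hits):
    """Combine the results into chronological order."""
    # `before_hits` arrive newest first, so reverse them.
    return list(reversed(before_hits)) + [anchor_hit] + list(after_hits)


if __name__ == "__main__":
    queries = build_context_queries([1700000000000, 5], size=4)
    print(queries["before"])
    print(queries["after"])
```

Each body would be sent to the index's `_search` endpoint, and the two result lists combined with `merge_context`. The tiebreaker field matters because several log lines can share the same timestamp.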

Upvotes: 5
