ms_27

Reputation: 1684

How does Elasticsearch handle parallel index refresh requests?

In our project, we call Elasticsearch's index refresh API after each create/update/delete operation so that changes are immediately available for search.

I want to know how Elasticsearch will perform if multiple parallel requests are made to its refresh API on a single index holding close to 2.5 million documents.

Any thoughts or suggestions?

Upvotes: 0

Views: 683

Answers (1)

Pierre Mallet

Reputation: 7221

A refresh is the operation where Elasticsearch asks each Lucene shard to write its buffered changes into a new segment and open it for search. If you request a refresh after every operation, you will create a huge number of micro-segments.

Too many segments slow down your searches, because a shard has to search through all of its segments sequentially to return a result. They also consume hardware resources.

"Each segment consumes file handles, memory, and CPU cycles. More important, every search request has to check every segment in turn; the more segments there are, the slower the search will be." (from the Definitive Guide)
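To make the segment argument concrete, here is a toy model in Python. It is only a sketch: `ToyShard` and `load` are made-up names, and the model ignores merging and everything else Lucene really does. It just shows that documents sit in a buffer until a refresh, that each refresh with pending documents adds one segment, and that a search walks every segment.

```python
class ToyShard:
    """Toy stand-in for a shard: an in-memory buffer plus a list of segments."""

    def __init__(self):
        self.buffer = []    # indexed but not yet searchable
        self.segments = []  # each refresh with pending docs adds one entry

    def index(self, doc):
        self.buffer.append(doc)

    def refresh(self):
        # Turn the buffered docs into one new searchable segment.
        if self.buffer:
            self.segments.append(tuple(self.buffer))
            self.buffer = []

    def search(self, predicate):
        # A search has to visit every segment in turn.
        hits = []
        for segment in self.segments:
            hits.extend(doc for doc in segment if predicate(doc))
        return hits


def load(docs, refresh_every):
    """Index docs, refreshing after every `refresh_every` operations."""
    shard = ToyShard()
    for count, doc in enumerate(docs, start=1):
        shard.index(doc)
        if count % refresh_every == 0:
            shard.refresh()
    shard.refresh()  # make any remaining buffered docs searchable
    return shard


docs = list(range(1000))
per_op = load(docs, refresh_every=1)
batched = load(docs, refresh_every=250)
print(len(per_op.segments), len(batched.segments))  # prints "1000 4"
```

Both shards return the same hits for the same query, but the per-operation one has to visit 1000 segments to do it instead of 4.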

Lucene will merge those segments automatically into bigger ones, but merging is itself an I/O-intensive task.
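A rough way to see why merging costs I/O is to count how often a document gets rewritten. The sketch below is not Lucene's actual merge policy. It assumes a simplified rule (merge whenever 10 segments of the same size exist), and `simulate_merges` and `merge_factor` are hypothetical names used only for illustration.

```python
def simulate_merges(num_segments, merge_factor=10):
    """Feed one-document segments into a simplified level-based merger.

    Returns (segments_left, docs_rewritten). Assumed rule: as soon as
    `merge_factor` segments of the same size exist, they are merged into
    one segment of the next size up, rewriting every document in them.
    """
    levels = []        # levels[i] = number of segments holding merge_factor**i docs
    docs_rewritten = 0
    for _ in range(num_segments):
        level = 0
        while True:
            if len(levels) <= level:
                levels.append(0)
            levels[level] += 1
            if levels[level] < merge_factor:
                break
            # Merge: the small segments disappear, one bigger one is written.
            levels[level] = 0
            docs_rewritten += merge_factor ** (level + 1)
            level += 1
    return sum(levels), docs_rewritten


left, rewritten = simulate_merges(1000)
print(left, rewritten)  # prints "1 3000"
```

Under this simplified rule, 1000 single-document segments end up as one segment, but each document has been written to disk three more times along the way.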

You can check this for more details

But as far as I know, a refresh on a 2.5 million document index takes about the same time as on a 2.5k document index. It also seems (from this issue) that refresh is a non-blocking operation.

But it's still a bad pattern for an Elasticsearch cluster. Does every create/update/delete operation in your application really need a refresh?
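If only some operations need to be visible right away, two documented Elasticsearch features may help: the `refresh=wait_for` query parameter on a write request, which waits for the next scheduled refresh instead of forcing one, and the `index.refresh_interval` index setting. The sketch below only builds the HTTP requests and does not send them. The helper names and the `products` index are hypothetical, and the paths assume a version with the typeless `_doc` endpoint.

```python
import json
from urllib.parse import quote, urlencode


def index_request(index, doc_id, document, refresh=None):
    """Build (method, path, body) for indexing one document.

    `refresh` may be None (parameter omitted), "true", "false" or "wait_for".
    """
    if refresh not in (None, "true", "false", "wait_for"):
        raise ValueError(f"unsupported refresh value: {refresh!r}")
    path = f"/{quote(index, safe='')}/_doc/{quote(str(doc_id), safe='')}"
    if refresh is not None:
        path += "?" + urlencode({"refresh": refresh})
    return "PUT", path, json.dumps(document)


def refresh_interval_request(index, interval):
    """Build (method, path, body) for changing how often the index refreshes."""
    path = f"/{quote(index, safe='')}/_settings"
    body = json.dumps({"index": {"refresh_interval": interval}})
    return "PUT", path, body


# Hypothetical index name and document, for illustration only.
print(index_request("products", 42, {"name": "x"}, refresh="wait_for"))
print(refresh_interval_request("products", "5s"))
```

With `wait_for`, the write call returns once a refresh has made the change searchable, so the application still sees its own write without forcing a new segment per operation.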

Upvotes: 1
