Doug

Reputation: 7077

Elasticsearch: Understanding how to optimize write-heavy operations so reads aren't impacted

We've got a NodeJS application with an Elasticsearch back-end that is very lightly used 90% of the time and occasionally gets absolutely slammed. For example, on a typical day it might receive 50-100 read requests and 1-2 write requests in an hour. At peak times it might receive 50,000 read requests and 30,000 write requests.

During these peak times we're running into a situation where there are so many write requests that the re-indexing, etc., slows even the read requests to a crawl, which makes the website unresponsive. To handle this type of load we clearly need to either optimize Elasticsearch somehow or re-architect the application, and I'm trying to figure out how best to do that.

What I'd like to understand better is:

1) What is happening during a write operation that seems to kill everything, and what options are available to optimize or speed it up?

2) From a code standpoint I can tell that I can insert records faster by using bulk operations, but I'm wondering whether the way Elasticsearch indexes these bulk requests is actually harder on the system. Should I see significantly better performance (specifically on the read side of things) if we get rid of bulk inserts, or at least make the batches smaller? Anything that helps me understand how this change might impact things would be helpful.

3) Is there any way to divide up the read/write operations so that even if the write operations are backed up, the read operations still continue to work?

E.g., I was thinking of using a message queue rather than direct Elasticsearch inserts, but, back to question #2, I'm not sure how to set this up so that read operations continue to work.
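
Something along the lines of this rough sketch is what I had in mind: requests push documents into a buffer and a background worker flushes them to Elasticsearch in batches. The index name, batch size, and flush interval are placeholders, the in-memory array just stands in for a real queue (RabbitMQ, Kafka, SQS, etc.), and it's written against the v8 `@elastic/elasticsearch` client (the v7 client takes the same array under `body`):

```typescript
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' });

// Placeholder buffer; a real deployment would use an external message queue.
const buffer: Array<Record<string, unknown>> = [];
const MAX_BATCH = 1000;         // flush once this many docs are queued
const FLUSH_INTERVAL_MS = 5000; // ...or every 5 seconds, whichever comes first

// Called from the request handler: enqueue and return immediately,
// so the HTTP response never waits on indexing.
export function enqueueWrite(doc: Record<string, unknown>): void {
  buffer.push(doc);
  if (buffer.length >= MAX_BATCH) {
    flush().catch(console.error);
  }
}

async function flush(): Promise<void> {
  if (buffer.length === 0) return;
  const docs = buffer.splice(0, buffer.length);
  // One bulk request per batch; error handling/retries omitted for brevity.
  await client.bulk({
    refresh: false, // let the normal refresh cycle make the docs searchable
    operations: docs.flatMap((doc) => [{ index: { _index: 'my-index' } }, doc]),
  });
}

setInterval(() => flush().catch(console.error), FLUSH_INTERVAL_MS);
```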

E.g., is there a way to do the inserts into a different cluster than the reads, and then merge the data? Would this be more or less efficient?

Thank you for your help.

Upvotes: 2

Views: 2160

Answers (1)

xeraa

Reputation: 10859

  1. Check out the different thread pools; those include index, search, and bulk. The idea is that a bulk request should not block a query. (A quick way to inspect them from Node is sketched after this list.)
  2. Definitely use bulk requests; you will save a lot of network overhead. But benchmark to find the optimal batch size for your scenario. Also pick a reasonable refresh interval, though that's a trade-off: the longer it is, the longer it takes until your data becomes searchable. (See the second sketch after this list.)
  3. If you have time-based data you can try different node types. But if all your writes and reads are going to the same indices, you're out of luck: there is currently no way to split a single index across dedicated read and write nodes.
  4. Having very spiky load might be a good use case for a queue, but it adds more moving parts and complexity. Depending on your situation it might be the right choice or it could be cheaper to simply overprovision your Elasticsearch cluster for the peak load.
  5. Make sure to get the number of indices and shards right. This applies to every cluster, but it's a common pain point. (The second sketch below also shows setting the shard count explicitly.)
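
For point 1, a quick way to watch the thread pools from Node is something like the following. This is only a sketch: it assumes the v8 `@elastic/elasticsearch` client, and the pool names and columns can differ between Elasticsearch versions (newer versions use a combined `write` pool instead of separate `index`/`bulk` pools).

```typescript
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' });

// Show per-node thread pool stats; a growing `queue` or a non-zero `rejected`
// count on the write/bulk pools during a spike means indexing is backing up.
async function showThreadPools(): Promise<void> {
  const pools = await client.cat.threadPool({
    format: 'json',
    h: 'node_name,name,active,queue,rejected',
  });
  console.table(pools);
}

showThreadPools().catch(console.error);
```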
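
For points 2 and 5, here is a hedged sketch of creating an index with an explicit shard count and a longer refresh interval, and of relaxing the refresh interval only for the duration of a peak. The index name and values (1 shard, 30s) are examples only, so benchmark against your own data; again this uses the v8 client, while the v7 client takes these settings under `body`.

```typescript
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' });

async function tuneIndex(): Promise<void> {
  // Explicit shard/replica counts plus a longer refresh interval mean fewer,
  // larger refreshes during heavy bulk indexing; documents may take up to
  // 30s to become searchable.
  await client.indices.create({
    index: 'my-index',
    settings: {
      number_of_shards: 1,
      number_of_replicas: 1,
      refresh_interval: '30s',
    },
  });

  // The refresh interval can also be relaxed just for a peak period...
  await client.indices.putSettings({
    index: 'my-index',
    settings: { refresh_interval: '30s' },
  });

  // ...and restored afterwards (1s is the default).
  await client.indices.putSettings({
    index: 'my-index',
    settings: { refresh_interval: '1s' },
  });
}

tuneIndex().catch(console.error);
```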

PS: If you find any tuning suggestions, make sure they apply to your Elasticsearch version; some settings have changed over time or were removed entirely. And more up-to-date Elasticsearch versions should generally perform better, so consider upgrading if you're not on the latest minor version.

Upvotes: 2
