user3070752

Reputation: 734

ElasticSearch Timeout Error: ReadTimeoutError(HTTPConnectionPool(host='localhost', port=9200): Read timed out. (read timeout=60))

I have an instance of ElasticSearch running on a server. When I try to index a huge corpus using multiprocessing, I get a lot of timeout errors. It seems that ElasticSearch can handle only a small number of requests. I've followed the configuration suggested on the ElasticSearch website. Are there any suggestions on what I should do to improve indexing performance in a multiprocessing setting? The index that I'm adding documents to has one shard.

Upvotes: 1

Views: 2873

Answers (1)

Saeed Nasehi

Reputation: 1000

There are several things you can try.

  • First, set refresh_interval on the index. The refresh interval controls how long it takes for a newly added document to become available for search. If you can tolerate the delay, set it to at least 30 seconds, or to -1 to disable refresh while you load the data. I have read that this can increase indexing performance by about 70%.

  • Second, use the bulk index API instead of indexing documents one at a time.

  • Disabling swap on the server can improve performance in some cases.

  • Another option is to increase the amount of RAM (the JVM heap) assigned to Elasticsearch.

  • Finally, increasing the share of the heap used for indexing (the index buffer, indices.memory.index_buffer_size) can improve write performance. The default is 10 percent of the total heap.
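
The first two points can be sketched roughly as follows. This is only a minimal illustration using the Python standard library, not a tuned loader: the helper names (refresh_settings_body, bulk_body, chunks, send, index_corpus) and the batch size of 500 are my own assumptions, the host and port are taken from the error message in the question, and the action line without a _type assumes Elasticsearch 7 or later. It also assumes the index used the default refresh interval of 1s before the load.

```python
import json
import urllib.request

# Host and port as they appear in the error message in the question.
ES_URL = "http://localhost:9200"


def refresh_settings_body(interval):
    """JSON body for PUT /<index>/_settings that changes refresh_interval."""
    return json.dumps({"index": {"refresh_interval": interval}})


def bulk_body(index_name, docs):
    """Build the newline-delimited JSON body expected by the _bulk endpoint.

    Each document contributes an action line followed by its source line,
    and the whole body must end with a newline.
    """
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"


def chunks(items, size):
    """Yield successive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def send(method, path, body, content_type="application/json"):
    """Send one request to Elasticsearch (needs a reachable cluster)."""
    request = urllib.request.Request(
        ES_URL + path,
        data=body.encode("utf-8"),
        method=method,
        headers={"Content-Type": content_type},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read())


def index_corpus(index_name, docs, batch_size=500):
    """Disable refresh, bulk-index in batches, then restore a 1s refresh."""
    send("PUT", f"/{index_name}/_settings", refresh_settings_body("-1"))
    try:
        for batch in chunks(docs, batch_size):
            send("POST", "/_bulk", bulk_body(index_name, batch),
                 content_type="application/x-ndjson")
    finally:
        # Assumption: the index used the default interval (1s) before the load.
        send("PUT", f"/{index_name}/_settings", refresh_settings_body("1s"))
```

With multiprocessing, each worker would call something like index_corpus on its own slice of the corpus, so that every request carries a few hundred documents instead of one. The right batch size depends on document size and is worth measuring.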

I hope these points help.

Upvotes: 1
