Reputation: 567
I'm currently trying to reindex a large set of data (around 96 million documents) using the Python API, specifically the reindex() helper.
When running it I eventually get a timeout error from the underlying bulk call. I've tried setting request_timeout in bulk_kwargs to 24 hours, but it still times out... after 28 hours and 57 million records loaded.
Re-running the reindex just deletes the documents already copied and starts over.
Regardless of why the error happens (I suspect a disk bottleneck, which I can fix; there are no out-of-memory errors), is there any easy way to continue the reindex from where it died?
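
For reference, here's roughly what the call looks like (host and index names are placeholders):

```python
from elasticsearch import Elasticsearch
from elasticsearch.helpers import reindex

# Placeholder host and index names.
es = Elasticsearch(["localhost:9200"])

reindex(
    es,
    source_index="source_index",
    target_index="target_index",
    chunk_size=500,
    scroll="60m",                            # keep the scan cursor alive between batches
    bulk_kwargs={"request_timeout": 86400},  # 24 hours, passed through to bulk()
)
```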
Upvotes: 2
Views: 2002
Reputation: 4903
If re-running already deletes the existing documents and starts over, then just delete the target index, create a new one, and feed it from scratch. That will be faster.
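
A minimal sketch of that option, assuming the elasticsearch-py client and placeholder index names:

```python
from elasticsearch import Elasticsearch
from elasticsearch.helpers import reindex

es = Elasticsearch(["localhost:9200"])  # placeholder host

# Drop the half-filled target index and start from a clean slate.
es.indices.delete(index="target_index", ignore=[404])
es.indices.create(index="target_index")

# Feed it again in one go.
reindex(es, source_index="source_index", target_index="target_index")
```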
OR
If you cannot have an empty index, then delete the stale documents (one by one or in batches), identified by some id, and insert the updated versions under that same id.
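
A rough sketch of that second option using the streaming_bulk helper; get_updated_docs() is a hypothetical generator for whatever identifies your changed records. Indexing under the same _id replaces the old copy, which covers the delete-then-insert in one step:

```python
from elasticsearch import Elasticsearch
from elasticsearch.helpers import streaming_bulk

es = Elasticsearch(["localhost:9200"])  # placeholder host

def actions(docs):
    # Index under the same _id so each re-run overwrites the stale copy
    # instead of duplicating it.
    for doc in docs:
        yield {
            "_op_type": "index",
            "_index": "target_index",
            "_type": "doc",          # omit on newer Elasticsearch versions
            "_id": doc["id"],        # hypothetical id field
            "_source": doc,
        }

# get_updated_docs() is hypothetical: however you fetch the changed records.
for ok, result in streaming_bulk(es, actions(get_updated_docs()), chunk_size=500):
    if not ok:
        print(result)
```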
Upvotes: 2