Reputation: 145
I have an index with 88 million docs, 0 replicas, and 1 shard on an SSD. When I use the reindex API (with size 3000 and refresh_interval -1), it gets slower and slower once we pass the 50 million mark.
I assume ES is checking whether each document already exists? Is there a way to reindex and strip the old document IDs so ES can generate new ones and index faster?
Also, how can I reindex from a specific point? The problem is that I have to pause my queue of new incoming docs until the reindex is complete, then switch the alias. It would be awesome if I could let the source index keep receiving new docs, then later start a second reindex to move over the docs that arrived while the big reindex was running.
Upvotes: 1
Views: 385
Reputation: 145
Added the following script to the reindex call to fix the issue:
ctx.remove('_id');
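For anyone wondering where that script goes, here is a rough sketch of what the full `_reindex` request body could look like, built as a plain Python dict so it can be sent with any HTTP client. The index names `docs_v1` and `docs_v2` and the `indexed_at` timestamp field are made up for illustration; they are not from my actual setup. The optional `query` argument is one possible way to copy only docs from a specific point onward, assuming your docs carry a timestamp you can filter on.

```python
import json


def build_reindex_body(source_index, dest_index, batch_size=3000, query=None):
    """Build a body for POST _reindex that drops the original _id.

    source_index / dest_index are placeholder names chosen by the caller.
    batch_size maps to source.size (the scroll batch size).
    query, if given, limits which source docs are copied.
    """
    source = {"index": source_index, "size": batch_size}
    if query is not None:
        source["query"] = query
    return {
        "source": source,
        "dest": {"index": dest_index},
        # The script from this answer: remove _id so ES generates a new one.
        "script": {"lang": "painless", "source": "ctx.remove('_id');"},
    }


# Full reindex with hypothetical index names.
full_body = build_reindex_body("docs_v1", "docs_v2")

# Follow-up pass: only docs at or after a cutoff, assuming a hypothetical
# `indexed_at` date field exists on every doc.
delta_body = build_reindex_body(
    "docs_v1",
    "docs_v2",
    query={"range": {"indexed_at": {"gte": "2020-01-01T00:00:00Z"}}},
)

print(json.dumps(full_body, indent=2))
```

One caveat with the follow-up pass: because the IDs are regenerated, running an overlapping range twice would create duplicates in the destination, so the cutoff has to be chosen carefully.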
Upvotes: 1