Reputation: 191
I need to re-index all my documents into a new index with updated mappings and different index settings, such as the number of shards.
The events are published to a Kafka topic and then consumed by a service that pushes each event to Elasticsearch. I don't want to stop consuming events while re-indexing.
To achieve this, I keep primaryIndex (the name of the old index) and secondaryIndex (the name of the new index) in the application.properties of a Spring app. While indexing a document, the application writes the event to both indices (primary and secondary) and reads from the primary index only. I will then run the _reindex API to copy documents from the old index to the new index. Because re-indexing will take about 4-5 days, an event already written to the new index may get overwritten with the older copy by the _reindex API, which I want to avoid.
How can I ensure my documents are not overwritten by the _reindex API?
Once re-indexing is done, I can remove the secondary index from my application properties and replace primaryIndex with the new index name, so that reads are also served from the new index.
Or is there a better approach to achieve the same result?
Upvotes: 0
Views: 983
Reputation: 109
You can instruct the _reindex API to copy a document to the new index only when it is not already present there. A document that is already present in the new index came from either a new event or an update event, and you don't want it to be overwritten.
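To do this, set op_type: 'create' in the dest section of the reindex request. Documents that already exist in the new index then produce version conflicts, so also set conflicts: 'proceed' so that the request does not abort on them.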
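As a rough illustration, the request could be built and submitted from Java as sketched below. The index names, the base URL passed on the command line, and the class and method names are all placeholders invented for this example. It also assumes that your service writes each event with the same _id the document has in the old index, since that is what op_type create checks.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper; names are illustrative only.
class ReindexCreateOnly {

    // Builds the _reindex request body.
    // "op_type": "create" makes reindex skip documents whose _id already
    // exists in the destination index instead of overwriting them.
    // "conflicts": "proceed" keeps the request running when it hits one.
    static String reindexBody(String sourceIndex, String destIndex) {
        return "{\n"
             + "  \"conflicts\": \"proceed\",\n"
             + "  \"source\": { \"index\": \"" + sourceIndex + "\" },\n"
             + "  \"dest\": { \"index\": \"" + destIndex + "\", \"op_type\": \"create\" }\n"
             + "}";
    }

    // Submits the request to the cluster at baseUrl (for example http://localhost:9200).
    // wait_for_completion=false makes Elasticsearch return a task id right away,
    // which suits a reindex that runs for days.
    static String startReindex(String baseUrl, String body) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(baseUrl + "/_reindex?wait_for_completion=false"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // "old-index" and "new-index" stand in for primaryIndex and secondaryIndex.
        String body = reindexBody("old-index", "new-index");
        System.out.println(body);
        // Only contact a cluster when a base URL is passed as the first argument.
        if (args.length > 0) {
            System.out.println(startReindex(args[0], body));
        }
    }
}
```

Documents skipped this way are reported as version conflicts in the reindex response rather than as failures.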
For more info, please follow the link https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html
Hope this answers your question :)
Upvotes: 1