Reputation: 127
We are using Elasticsearch on Amazon as our search engine. I recently discussed upsert tactics with one of our developers.
In my view (I am not a very experienced ES developer), it is fine to use a composite key as _id, e.g. Result-1, Data-2, etc. It would help with upserts and data deduplication. However, a concern was raised about the key's data type: a long key, such as a string, a SHA-1 digest, or a hex value, could affect search performance, so it would be better to use short keys, or to send documents to ES without a predefined _id and deduplicate on the document body or some specific properties.
I haven't found anything about ID performance, either in the official docs or in Medium posts and blogs.
Is the concern valid, and should I follow that advice?
Thank you!
Upvotes: 0
Views: 1102
Reputation: 2510
The concern with custom IDs applies to the indexing phase, not to search: with auto-generated IDs, Elasticsearch can index a document without first checking whether a document with that ID already exists. If you are OK with your indexing rate, you should be fine.
If you look at the Tune for search speed page in the docs, there is no advice about using auto-generated IDs.
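To make the trade-off concrete, here is a minimal sketch of both approaches. It only builds action dicts in the format accepted by elasticsearch.helpers.bulk and does not talk to a cluster. The function names (make_doc_id, upsert_action, auto_id_action) and the separator choice are my own assumptions for illustration, not anything from your setup.

```python
import hashlib


def make_doc_id(*key_parts: str) -> str:
    """Derive a deterministic, fixed-length _id from the parts of a composite key."""
    # Assumption: the unit-separator character never occurs inside a key part,
    # so ("a", "bc") and ("ab", "c") cannot collide.
    raw = "\x1f".join(key_parts)
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()


def upsert_action(index: str, key_parts: tuple, doc: dict) -> dict:
    """Bulk action with an explicit _id: update the document, or create it if missing."""
    return {
        "_op_type": "update",
        "_index": index,
        "_id": make_doc_id(*key_parts),
        "doc": doc,
        "doc_as_upsert": True,
    }


def auto_id_action(index: str, doc: dict) -> dict:
    """Bulk action without an _id: Elasticsearch generates one, so no dedup by ID."""
    return {"_op_type": "index", "_index": index, "_source": doc}
```

Either kind of action can be passed to elasticsearch.helpers.bulk(client, actions). With the first, re-sending the same key overwrites the existing document, at the cost of an ID lookup on each write. With the second, writes skip that lookup, but deduplication has to happen elsewhere.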
Relevant reads.
Upvotes: 1