David Kong

Reputation: 638

How to copy the _id field during indexing (Elasticsearch)

It's often useful to have the _id as a part of the document. In fact it's advised here: https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-id-field.html

But if you do not know the _id before the document is created, how would you duplicate the _id during indexing? The only way I can think of is an ingest pipeline, but is there a simpler way?

Edit: according to the answer below, even a pipeline cannot achieve this.

Upvotes: 1

Views: 1239

Answers (2)

Noa

Reputation: 355

In case someone is still looking for a solution to this issue: you can reindex with a script and use the ctx object to get hold of the _id and copy it into the document source, so that it matches the ID in the POCO.

POST /_reindex?wait_for_completion=false
{
  "source": {
    "index": "data.dataitems",
    "query": {
      "match_all": {}
    }
  },
  "dest": {
    "index": "data.dataitems_new_index_with_id"
  },
  "script": {
    "source": "ctx._source.id = ctx._id"
  }
}

Upvotes: 0

ibexit

Reputation: 3667

Ingest pipelines (current version 7.9.2) cannot access the _id if the _id is auto-generated. There is a note in the documentation saying:

If you automatically generate document IDs, you cannot use the {{_id}} value in an ingest processor. Elasticsearch assigns auto-generated _id values after ingest.
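For contrast, here is a minimal sketch of a pipeline definition that copies the _id when the ID is supplied explicitly with the index request. The pipeline name `copy-id` and the target field `id` are assumptions chosen for illustration, and the script only builds and prints the request body. It does not contact a cluster.

```python
import json

# Hypothetical pipeline name, chosen for illustration only.
PIPELINE_NAME = "copy-id"


def copy_id_pipeline(target_field="id"):
    """Build an ingest pipeline body with a single `set` processor.

    The processor copies the _id metadata field into `target_field`.
    Per the documentation note quoted above, this cannot work when
    Elasticsearch auto-generates the _id, because the ID is assigned
    after ingest.
    """
    return {
        "description": "Copy _id into the document source (explicit IDs only)",
        "processors": [
            {"set": {"field": target_field, "value": "{{_id}}"}}
        ],
    }


if __name__ == "__main__":
    # Print the request in console form so it can be pasted into Dev Tools.
    print(f"PUT /_ingest/pipeline/{PIPELINE_NAME}")
    print(json.dumps(copy_id_pipeline(), indent=2))
```

A document would then have to be indexed with both an explicit ID and the `pipeline` request parameter for the copy to take effect.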

The copy_to feature also doesn't work for _id when it is auto-generated. This information is a little bit hidden here: https://github.com/elastic/elasticsearch/issues/6730#issuecomment-103142553

Using doc['_id'].value in script_fields queries is deprecated too.

It seems to me that this is what many of us are looking for, for different reasons, but there is no solution, at least none that I am aware of.

The case is obviously completely different for self-generated document IDs.
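As a rough sketch of the self-generated case: if the client creates the ID itself, it can write the same value into both the _id and a field in the source before sending the document. The field name `id`, the UUID scheme and the helper names below are assumptions for illustration. The script only builds a _bulk payload and does not send it.

```python
import json
import uuid


def with_client_side_id(doc):
    """Return (doc_id, body), where body is a copy of doc with an "id" field."""
    doc_id = uuid.uuid4().hex  # assumed ID scheme; any unique string would do
    body = dict(doc, id=doc_id)
    return doc_id, body


def bulk_lines(index, docs):
    """Build an NDJSON payload for the _bulk API.

    Each document yields one action line carrying the explicit _id and
    one source line carrying the same value in its "id" field.
    """
    lines = []
    for doc in docs:
        doc_id, body = with_client_side_id(doc)
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(body))
    # The _bulk API requires the payload to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    docs = [{"title": "first"}, {"title": "second"}]
    print(bulk_lines("data.dataitems", docs), end="")
```

The trade-off is that the client, not Elasticsearch, becomes responsible for ID uniqueness.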

Upvotes: 3
