wolfgang

Reputation: 7789

Elasticsearch/Python - Re-index data after changing the mappings?

I'm a little stuck on how to re-index data in Elasticsearch after a mapping or a data type has been changed.

According to the Elasticsearch docs:

Pull the documents in from your old index, using a scrolled search and index them into the new index using the bulk API. Many of the client APIs provide a reindex() method which will do all of this for you. Once you are done, you can delete the old index.

This is my old mapping:

{
  "test-index2": {
    "mappings": {
      "business": {
        "properties": {
          "address": {
            "type": "nested",
            "properties": {
              "country": {
                "type": "string"
              },
              "full_address": {
                "type": "string"
              }
            }
          }
        }
      }
    }
  }
}

This is the new index mapping; I'm renaming full_address to location_address:

{
  "test-index2": {
    "mappings": {
      "business": {
        "properties": {
          "address": {
            "type": "nested",
            "properties": {
              "country": {
                "type": "string"
              },
              "location_address": {
                "type": "string"
              }
            }
          }
        }
      }
    }
  }
}

I'm using the Python client for Elasticsearch:

https://elasticsearch-py.readthedocs.org/en/master/helpers.html#elasticsearch.helpers.reindex

from elasticsearch import Elasticsearch
from elasticsearch.helpers import reindex
es = Elasticsearch(["es.node1"])

reindex(es, "source_index", "target_index")

However, this only copies the data from one index to another.

How can I use this to change the mappings (data types, etc.) for my case above?

Upvotes: 4

Views: 4231

Answers (3)

Chitra

Reputation: 1404

After updating the mapping, you can update the existing documents using the bulk API.

POST /_bulk
{"update":{"_id":"59519","_type":"asset","_index":"assets"}}
{"doc":{"facility_id":491},"detect_noop":false}

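Note: 'detect_noop' controls whether Elasticsearch skips updates that would not change the document. Setting it to false, as above, forces the update to be written regardless.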
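For many documents, the request body above can be generated rather than typed by hand. This is only a minimal sketch: build_bulk_update_body is a hypothetical helper name, and the index, type, and field values are the ones from the example request. It only builds the newline-delimited body and does not talk to a cluster.

```python
import json


def build_bulk_update_body(updates, index, doc_type):
    """Build a newline-delimited bulk body of partial-document updates.

    `updates` is an iterable of (document id, partial document) pairs.
    """
    lines = []
    for doc_id, partial_doc in updates:
        # Action line: which document to update
        lines.append(json.dumps(
            {"update": {"_index": index, "_type": doc_type, "_id": doc_id}}
        ))
        # Payload line: the fields to merge into the existing document
        lines.append(json.dumps({"doc": partial_doc, "detect_noop": False}))
    # The bulk API expects the body to end with a newline
    return "\n".join(lines) + "\n"


body = build_bulk_update_body([("59519", {"facility_id": 491})], "assets", "asset")
print(body)
```

With the Python client, a body like this could presumably be sent with es.bulk(body=body). The "_type" entry matches the older Elasticsearch versions used in this thread.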

Upvotes: 0

wolfgang

Reputation: 7789

It's straightforward if you use the scan & scroll helper and the Bulk API helper that the Python client for Elasticsearch already provides:

First, fetch all the documents with the scan & scroll method.

Then loop through them and make the necessary modifications to each document.

Finally, insert the modified documents into a new index using the Bulk API.

from elasticsearch import Elasticsearch, helpers

es = Elasticsearch()

# Use the scan & scroll method to fetch all documents from your old index
res = helpers.scan(es, query={
  "query": {
    "match_all": {}
  },
  "size": 1000
}, index="old_index")

new_insert_data = []

# Rename the field in every document and point the document at the new index
for x in res:
    x['_index'] = 'new_index'
    # Rename "full_address" to "location_address" inside the nested "address"
    # field, which may hold a single object or a list of objects
    addresses = x['_source']['address']
    for address in (addresses if isinstance(addresses, list) else [addresses]):
        address['location_address'] = address.pop('full_address')
    # _score is not needed for indexing
    x.pop('_score', None)

    # Add the new data into a list
    new_insert_data.append(x)

print(new_insert_data)

# Use the Bulk API to insert the list of your modified documents into the new index
helpers.bulk(es, new_insert_data)

# Refresh the new index so the inserted documents become searchable
es.indices.refresh(index="new_index")
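For a large index, it may be preferable not to collect every document in a list first. Below is a minimal sketch of the same rename written as a generator of bulk actions. The function names rename_nested_field and to_actions are hypothetical, and the field names are taken from the mappings in the question. The transformation itself is plain Python, so it can be tried without a cluster.

```python
def rename_nested_field(hit, new_index, parent="address",
                        old="full_address", new="location_address"):
    """Return a bulk action for `hit` with parent.old renamed to parent.new."""
    source = dict(hit["_source"])  # shallow copy, the original hit is left untouched
    value = source.get(parent)
    # A nested field may hold one object or a list of objects
    items = value if isinstance(value, list) else [value]
    renamed = []
    for item in items:
        if isinstance(item, dict) and old in item:
            item = dict(item)
            item[new] = item.pop(old)
        renamed.append(item)
    if parent in source:
        source[parent] = renamed if isinstance(value, list) else renamed[0]
    # Only carry over the metadata that indexing needs (no _score)
    action = {"_index": new_index, "_id": hit["_id"], "_source": source}
    if "_type" in hit:
        action["_type"] = hit["_type"]
    return action


def to_actions(hits, new_index):
    """Lazily turn search hits into bulk actions for `new_index`."""
    for hit in hits:
        yield rename_nested_field(hit, new_index)


sample_hit = {
    "_index": "old_index", "_type": "business", "_id": "1", "_score": 0.0,
    "_source": {"address": {"country": "DE", "full_address": "1 Main St"}},
}
print(list(to_actions([sample_hit], "new_index")))
```

Against a live cluster, the generator could be passed straight to the helpers, for example helpers.bulk(es, to_actions(helpers.scan(es, index="old_index"), "new_index")), so documents stream through instead of sitting in memory. The new index should be created with the new mapping beforehand; otherwise Elasticsearch infers a mapping dynamically from the first documents it receives.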

Upvotes: 4

bittusarkar

Reputation: 6357

The reindex() API simply copies documents from one index to another. It has no way to detect or infer that the field full_address in documents of the old index should become location_address in documents of the new index. I doubt any API provided by the standard Elasticsearch clients can do what you want. The only way I can think of is custom logic on the client side: maintain a dictionary that maps field names in the old index to field names in the new index, read each document from the old index, and index the corresponding document into the new index with the field names looked up in that dictionary.
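The field name dictionary described above might look roughly like this. It is only a sketch: FIELD_RENAMES and rename_fields are hypothetical names, and the single entry comes from the question. Keys are matched by name at any depth, so a field that needs different handling at different levels would need something more specific, such as dotted paths.

```python
# Hypothetical old-name -> new-name table; extend it as the mapping changes.
FIELD_RENAMES = {"full_address": "location_address"}


def rename_fields(value, renames=FIELD_RENAMES):
    """Recursively copy `value`, renaming dict keys that appear in `renames`."""
    if isinstance(value, dict):
        return {
            renames.get(key, key): rename_fields(child, renames)
            for key, child in value.items()
        }
    if isinstance(value, list):
        return [rename_fields(child, renames) for child in value]
    # Scalars (strings, numbers, None, ...) are returned unchanged
    return value


old_doc = {"address": [{"country": "IN", "full_address": "12 MG Road"}]}
new_doc = rename_fields(old_doc)
print(new_doc)
```

Each document read from the old index would have its _source passed through rename_fields before being indexed into the new index.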

Upvotes: 0
