young_minds1

Reputation: 1471

Get the differences in document values between two indexes in Elasticsearch

I have a use case with two indexes in Elasticsearch, index_old and index_new, which hold documents like the following.

Sample records for index_old:

{
  "unique_id": "french_toast",
  "name": "French Toast",
  "description": "French toast is a dish made of sliced bread soaked in beaten eggs, sugar and typically milk, then pan fried",
  "ingredients": ["bread", "eggs", "sugar", "oil", "milk"]
}

{
  "unique_id": "japanese_cheesecake",
  "name": "Japanese cheesecake",
  "description": "Japanese cheesecake is a variety of cheesecake that is usually lighter in texture and less sweet than North American-style cheesecakes",
  "ingredients": ["cream Cheese", "butter", "sugar", "egg"]
}

{
  "unique_id": "kimchi",
  "name": "Kimchi",
  "description": "Kimchi, is a traditional Korean side dish of salted and fermented vegetables, such as napa cabbage and Korean radish",
  "ingredients": ["fermented cabbage", "radish", "cucumber"]
}

{
  "unique_id": "turkish_delight",
  "name": "turkish delight",
  "description": "Turkish delight or lokum is a family of confections based on a gel of starch and sugar",
  "ingredients": ["starch", "sugar"]
}

Sample records for index_new:

{
  "unique_id": "french_toast",
  "name": "French Toast",
  "description": "French toast is a dish made of sliced bread soaked in beaten eggs, sugar and typically milk, then pan fried",
  "ingredients": ["bread", "eggs", "sugar", "oil", "milk"]
}

{
  "unique_id": "japanese_cheesecake",
  "name": "Japanese cheesecake",
  "description": "Japanese cheesecake also known as soufflé-style cheesecake, cotton cheesecake, or light cheesecake is a variety of cheesecake that is usually lighter in texture and less sweet than North American-style cheesecakes",
  "ingredients": ["cream Cheese", "butter", "sugar", "egg", "butter"]
}

{
  "unique_id": "kimchi",
  "name": "Kimchi",
  "description": "Kimchi, is a traditional Korean side dish of salted and fermented vegetables, such as napa cabbage and Korean radish",
  "ingredients": ["fermented cabbage", "radish", "cucumber", "soya sauce", "ginger", "garlic"]
}

{
  "unique_id": "turkish_delight",
  "name": "turkish delight",
  "description": "Turkish delight or lokum is a family of confections based on a gel of starch and sugar",
  "ingredients": ["starch", "sugar", "pistachios", "dry fruits"],
  "origin": "turkey"
}

Only the document with unique_id french_toast is identical in both indexes. The differences in the other documents between index_old and index_new are:

  1. unique_id: japanese_cheesecake =>
    • description is changed
    • ingredients is changed
  2. unique_id: kimchi =>
    • ingredients is changed
  3. unique_id: turkish_delight =>
    • ingredients is changed
    • new field origin is added.

The only way I can think of is a Python script (as I mostly work in Python) that queries both indexes and compares the document values with a brute-force approach. That means querying Elasticsearch multiple times, which does not seem like a good way to go.

Is there a way to get the above differences between the documents of the two indexes entirely within Elasticsearch, using one or more queries?

Thanks in advance!

Upvotes: 1

Views: 334

Answers (1)

warkolm

Reputation: 2064

There's nothing that can do this natively in Elasticsearch, unfortunately. Your best bet is to handle it externally in your own code.
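As a rough illustration of what "externally in your own code" could look like, here is a minimal Python sketch. It assumes the documents of each index have already been loaded into a dict keyed by unique_id. The function names `diff_docs` and `diff_indexes` are made up for this example, and the sample data is a trimmed copy of the records in the question.

```python
def diff_docs(old, new):
    """Return {field: "added" | "removed" | "changed"} for two documents (dicts)."""
    changes = {}
    for field in sorted(old.keys() | new.keys()):
        if field not in old:
            changes[field] = "added"
        elif field not in new:
            changes[field] = "removed"
        elif old[field] != new[field]:
            changes[field] = "changed"
    return changes


def diff_indexes(old_docs, new_docs):
    """Compare two {unique_id: document} mappings; keep only ids that differ.

    A document present in only one mapping is reported with every one of
    its fields marked "added" or "removed".
    """
    report = {}
    for uid in sorted(old_docs.keys() | new_docs.keys()):
        changes = diff_docs(old_docs.get(uid, {}), new_docs.get(uid, {}))
        if changes:
            report[uid] = changes
    return report


# Trimmed sample data from the question (assumed to be already fetched).
index_old = {
    "french_toast": {"name": "French Toast", "ingredients": ["bread", "eggs"]},
    "kimchi": {"name": "Kimchi", "ingredients": ["fermented cabbage", "radish"]},
    "turkish_delight": {"name": "turkish delight", "ingredients": ["starch", "sugar"]},
}
index_new = {
    "french_toast": {"name": "French Toast", "ingredients": ["bread", "eggs"]},
    "kimchi": {"name": "Kimchi", "ingredients": ["fermented cabbage", "radish", "ginger"]},
    "turkish_delight": {
        "name": "turkish delight",
        "ingredients": ["starch", "sugar", "pistachios"],
        "origin": "turkey",
    },
}

print(diff_indexes(index_old, index_new))
# {'kimchi': {'ingredients': 'changed'},
#  'turkish_delight': {'ingredients': 'changed', 'origin': 'added'}}
```

To avoid issuing one query per document, the two mappings could be filled by scrolling through each index once, for example with `elasticsearch.helpers.scan` from the official Python client, and reading `unique_id` from each hit's `_source`. For indexes too large to hold in memory, this sketch would need to be adapted, for instance by iterating both indexes sorted by unique_id and comparing as you go.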

Upvotes: 1
