Always Sunny

Reputation: 38502

Delete only fields from all the documents that match my criteria

To delete a field (and its value) from all documents without deleting the documents themselves from the Elasticsearch index, I usually use the _update_by_query API with a script.

For example, on the my_properties index I want to delete a field (key => value) from every document where it exists. With the query below I can delete the feed field wherever it exists, but in this case the feed field is of text/string type.

Index: my_properties

Field: feed

Type: text

Example Value on feed: feed: ["AB-1234"]

POST my_properties/_update_by_query?refresh&conflicts=proceed
{
    "script" : "ctx._source.remove('feed')",
    "query" : {
        "exists": { "field": "feed" }
    }
}

My main problem is when the field type is nested instead of text/string.

Index: my_properties

Field: feed_v2

Type: nested

Example Value on feed_v2: feed_v2: [{"feed":12},{"id":["AB-9999"]}]

Approach 1:

POST my_properties/_update_by_query?refresh&conflicts=proceed
{
    "script" : "ctx._source.remove('feed_v2')",
    "query" : {
        "exists": { "field": "feed_v2" }
    }
}

Approach 2:

POST my_properties/_update_by_query?refresh&conflicts=proceed
{
    "script" : "ctx._source.feed_v2.remove('feed')",
    "query" : {
        "exists": { "field": "feed_v2.feed" }
    }
}

Neither approach works. Am I missing something? I'm not sure, but my guess is that this part is the problem:

"query" : {"exists": { "field": "feed_v2" }}

The exists query doesn't work the same way with a nested field, which is why it doesn't find anything when I try to delete a nested field.

According to https://stackoverflow.com/a/53771354/1138192 it should work, but it doesn't work for me.

Upvotes: 2

Views: 1020

Answers (1)

Kamal Kunjapur

Reputation: 8840

Elasticsearch has the concept of a nested datatype, and only a nested query can search inside such fields. Your exists query would need to take the form below:

Nested Query to check if field Exists:

POST <your_index_name>/_search
{
  "query": {
    "nested": {
      "path": "feed_v2",
      "query": {
        "exists": {
          "field": "feed_v2.feed"
        }
      }
    }
  }
} 

You are looking for a query that deletes a field from a nested document, and I've come up with the script below for that.

According to this link, nested documents differ from normal documents in that each nested object is indexed as a separate hidden document, which is why the query you are using doesn't work.
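As a rough illustration of that point, here is a toy Python model (not how Lucene actually stores anything) in which each nested object is split off into its own hidden document. The function names and the `title` field are made up for this sketch; only `feed_v2` and its example value come from the question.

```python
# Toy model only: real Elasticsearch/Lucene internals are far more involved.

def index_document(source, nested_paths):
    """Split a source dict into a root document and hidden nested documents."""
    root = {k: v for k, v in source.items() if k not in nested_paths}
    hidden = []
    for path in nested_paths:
        for obj in source.get(path, []):
            # Each nested object becomes its own hidden document,
            # keyed by the full field path (e.g. "feed_v2.feed").
            hidden.append((path, {f"{path}.{k}": v for k, v in obj.items()}))
    return root, hidden


def top_level_exists(root, field):
    """Mimic a plain `exists` query: it only sees fields on the root document."""
    return field in root


def nested_exists(hidden, path, field):
    """Mimic `exists` wrapped in a `nested` query: it looks inside hidden documents."""
    return any(p == path and field in doc for p, doc in hidden)


# "title" is a hypothetical non-nested field added for contrast.
source = {"title": "demo", "feed_v2": [{"feed": 12}, {"id": ["AB-9999"]}]}
root, hidden = index_document(source, ["feed_v2"])

print(top_level_exists(root, "feed_v2"))                 # False
print(top_level_exists(root, "feed_v2.feed"))            # False
print(nested_exists(hidden, "feed_v2", "feed_v2.feed"))  # True
```

In this model the plain exists check never sees `feed_v2` or `feed_v2.feed` because neither lives on the root document, which matches the behaviour described in the question.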

Update-by-query script that deletes a field from a nested document

This covers the scenario where you want to delete only feed, i.e. feed_v2.feed, but preserve the rest of the feed_v2 fields.

POST my_properties/_update_by_query
{
  "query": {
    "nested": {
      "path": "feed_v2",
      "query": {
        "exists": { "field": "feed_v2.feed" }
      }
    }
  },
  "script": {
    "lang": "painless",
    "source": """
      // Walk every nested object and drop the key if it is present
      for (int i = 0; i < ctx._source.feed_v2.size(); i++) {
        Map myKV = ctx._source.feed_v2.get(i);
        if (myKV.containsKey(params.key_ref)) {
          myKV.remove(params.key_ref);
        }
      }
    """,
    "params": {
      "key_ref": "feed"
    }
  }
}
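To see what the Painless loop above does to a single document, here is a small Python sketch that applies the same logic to a plain dict holding the example value from the question. The function name `remove_key_from_nested` is invented for this illustration; it is not part of any Elasticsearch client.

```python
def remove_key_from_nested(source, nested_field, key_ref):
    """Remove key_ref from every object under source[nested_field], in place.

    Returns the number of nested objects that were changed.
    """
    changed = 0
    for obj in source.get(nested_field, []):
        if key_ref in obj:
            del obj[key_ref]
            changed += 1
    return changed


# Example value taken from the question
doc = {"feed_v2": [{"feed": 12}, {"id": ["AB-9999"]}]}
removed = remove_key_from_nested(doc, "feed_v2", "feed")

print(removed)  # 1
print(doc)      # {'feed_v2': [{}, {'id': ['AB-9999']}]}
```

Note that the object which only contained feed is left behind as an empty object rather than being dropped from the array. If that matters for your data, you would need an extra step to filter out empty objects.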

Hope this helps!

Upvotes: 2
