Casey Johnson

Reputation: 311

Converting Elastic Search field to Array

In Elasticsearch, suppose you have a document with a pre-existing array field:

"movies": [
     "Back to the Future"
]

You then update it to add more movies, like this:

{
  "script" : "ctx._source.movies += tag",
  "params" : {
    "tag" : "Pulp Fiction"
  }      
}

The value is appended to the array, and that works great. But what if the field isn't an array to start with and instead looks like this:

"movies": "Back to the Future"

If you run the same script you will get the following result

"movies": "Back to the FuturePulp Fiction"

So my question is: how do I take this existing string field and "convert" it to an array, so that Elasticsearch appends new values as array elements instead of concatenating strings?

Upvotes: 2

Views: 2224

Answers (3)

G0l0s

Reputation: 496

Inside a script, an array in a document's source is a java.util.List, so you can test for it with instanceof List.

Sample documents covering five cases:

  1. empty string
  2. empty list
  3. list of one item
  4. string
  5. null value

PUT /add_item_to_array/_bulk
{"create":{"_id":1}}
{"movies": ""}
{"create":{"_id":2}}
{"movies": []}
{"create":{"_id":3}}
{"movies": ["Back to the Future"]}
{"create":{"_id":4}}
{"movies": "The Fall"}
{"create":{"_id":5}}
{"movies": null}

Update by query

POST /add_item_to_array/_update_by_query
{
    "script" : {
        "source": """
            def fieldValue = ctx._source['movies'];
            if (!(fieldValue instanceof List)) {
                ctx['_source']['movies'] = [];
                if ((fieldValue != '') && (fieldValue != null)) {
                    ctx['_source']['movies'].add(fieldValue);
                }
            }
            ctx['_source']['movies'].add(params.tag);
        """,
        "params" : {
            "tag" : "Pulp Fiction"
        }
    }
}

Query to view the document contents

GET /add_item_to_array/_search?filter_path=hits.hits

Response

{
    "hits" : {
        "hits" : [
            {
                "_index" : "add_item_to_array",
                "_type" : "_doc",
                "_id" : "1",
                "_score" : 1.0,
                "_source" : {
                    "movies" : [
                        "Pulp Fiction"
                    ]
                }
            },
            {
                "_index" : "add_item_to_array",
                "_type" : "_doc",
                "_id" : "3",
                "_score" : 1.0,
                "_source" : {
                    "movies" : [
                        "Back to the Future",
                        "Pulp Fiction"
                    ]
                }
            },
            {
                "_index" : "add_item_to_array",
                "_type" : "_doc",
                "_id" : "2",
                "_score" : 1.0,
                "_source" : {
                    "movies" : [
                        "Pulp Fiction"
                    ]
                }
            },
            {
                "_index" : "add_item_to_array",
                "_type" : "_doc",
                "_id" : "4",
                "_score" : 1.0,
                "_source" : {
                    "movies" : [
                        "The Fall",
                        "Pulp Fiction"
                    ]
                }
            },
            {
                "_index" : "add_item_to_array",
                "_type" : "_doc",
                "_id" : "5",
                "_score" : 1.0,
                "_source" : {
                    "movies" : [
                        "Pulp Fiction"
                    ]
                }
            }
        ]
    }
}

You can add a check for the field's existence yourself.
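
One possible way to write such a check, sketched under the assumption that documents without a movies key should be left untouched. It reuses the add_item_to_array index and the tag parameter from the request above. containsKey is the standard Map method available on ctx._source, and setting ctx.op to 'noop' makes update by query skip the document.

```
POST /add_item_to_array/_update_by_query
{
    "script" : {
        "source": """
            if (!ctx._source.containsKey('movies')) {
                // Assumption: a document with no 'movies' key is left unchanged.
                ctx.op = 'noop';
            } else {
                def fieldValue = ctx._source['movies'];
                if (!(fieldValue instanceof List)) {
                    // Not a list yet: start a new one and keep any non-empty scalar.
                    ctx._source['movies'] = [];
                    if (fieldValue != null && fieldValue != '') {
                        ctx._source['movies'].add(fieldValue);
                    }
                }
                ctx._source['movies'].add(params.tag);
            }
        """,
        "params" : {
            "tag" : "Pulp Fiction"
        }
    }
}
```

If you would rather create the field when it is missing, drop the outer if/else: a missing key reads as null and is already handled by the inner branch.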

Upvotes: 0

Val

Reputation: 217584

You can use this script instead. It checks whether movies is an array and, if not, creates one:

{
  "script" : "if (ctx._source.movies.getClass().isArray()) { ctx._source.movies += tag } else { ctx._source.movies = [ctx._source.movies, tag] }",
  "params" : {
    "tag" : "Pulp Fiction"
  }      
}

Another, shorter way of doing it is to always assign an array and then "flatten" it using Groovy's Collection.flatten() method:

{
  "script" : "ctx._source.movies = [ctx._source.movies, tag].flatten()",
  "params" : {
    "tag" : "Pulp Fiction"
  }      
}
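
The scripts above are written in Groovy, which was removed as a scripting language in Elasticsearch 6.0. On newer versions, a rough Painless equivalent for a single document might look like the sketch below. The index name add_item_to_array and the document ID 4 are placeholders chosen for illustration, and the typeless _update endpoint shown assumes Elasticsearch 7 or later.

```
POST /add_item_to_array/_update/4
{
  "script" : {
    "lang" : "painless",
    "source" : """
      // Current value of the field; null when the field is missing.
      def current = ctx._source.movies;
      if (current instanceof List) {
        // Already an array: append in place.
        current.add(params.tag);
      } else if (current == null) {
        // Missing or null: start a new one-element array.
        ctx._source.movies = [params.tag];
      } else {
        // Scalar value: wrap the old value and the new tag in an array.
        ctx._source.movies = [current, params.tag];
      }
    """,
    "params" : {
      "tag" : "Pulp Fiction"
    }
  }
}
```

Note that in Painless the parameter is read as params.tag rather than as a bare tag variable.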

Upvotes: 2

Tammo Heeren

Reputation: 2104

Adding to Val's answer: if ctx._source.movies does not exist, the result will contain a null in your list. Below is the script I use to do something similar without including the null.

{
  "script": "if (ctx._source.movies.getClass().isArray()) {ctx._source.movies += tag} else if (ctx._source.movies) {ctx._source.movies = [ctx._source.movies, tag]} else {ctx._source.movies = [tag]}",
  "params" : { 
    "tag" : "Pulp Fiction" 
  }
}

Upvotes: 2
