Anthony

Reputation: 658

Bulk update by query in Elasticsearch?

I know that Elasticsearch does not currently support bulk updating by query because of Lucene, but are there any alternatives that don't involve installing an Elasticsearch extension?

For example, is there a workaround for performing the equivalent of:

UPDATE users SET temp = 1 WHERE temp = 0;

using the bulk API, or some other method that I don't know about?

I'm new to Elasticsearch, so I don't know its ins and outs. I have read a lot about its ability to update documents one at a time, but that would be too time-consuming with hundreds of thousands of rows.

Just looking for someone to point me in the right direction.

Upvotes: 8

Views: 17496

Answers (3)

spazm

Reputation: 4809

update_by_query was added to Elasticsearch in version 2.3.

From the 2.3 documentation: "The update-by-query API is new and should still be considered experimental. The API may change in ways that are not backwards compatible."

https://www.elastic.co/guide/en/elasticsearch/reference/2.3/docs-update-by-query.html

It seems like you need to write a script for the update portion, so it's a bit of a pain.

UPDATE users SET temp = 1 WHERE temp = 0;

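translates to the following request body, sent with POST to /users/_update_by_query (assuming the index is named users):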

{
    "query": {
        "term": {
            "temp": 0
        }
    },
    "script": {
        "inline": "ctx._source.temp = 1"
    }
}

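Note: For this inline script version to work, inline scripting must be enabled in elasticsearch.yml. On 2.x that means:

script.inline: true
script.indexed: true

(script.indexed only matters for indexed scripts, not inline ones.) On 1.x the equivalent setting was script.disable_dynamic: false, but it was replaced by the fine-grained settings above in 2.0, and update_by_query needs 2.3 anyway.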
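
If you'd rather send the request from code than paste JSON around, here is a minimal Python sketch using only the standard library. It builds the same kind of body as above, except that the new value is passed as a script parameter instead of being hard-coded. The function names, the localhost:9200 address and the users index name are my own assumptions for illustration, not anything from the Elasticsearch docs.

```python
import json
import urllib.request


def build_update_by_query(field, old_value, new_value):
    """Build a 2.3-style body for: SET field = new_value WHERE field = old_value."""
    return {
        "query": {"term": {field: old_value}},
        "script": {
            # The new value is passed as a script parameter named "value".
            "inline": "ctx._source." + field + " = value",
            "params": {"value": new_value},
        },
    }


def send_update_by_query(base_url, index, body):
    """POST the body to <base_url>/<index>/_update_by_query and return the parsed reply."""
    request = urllib.request.Request(
        base_url + "/" + index + "/_update_by_query",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


body = build_update_by_query("temp", 0, 1)
print(json.dumps(body, indent=4))
# To run it against a real cluster (assumed here to be at localhost:9200):
# send_update_by_query("http://localhost:9200", "users", body)
```

The field name is concatenated into the script text, so only pass field names you control.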

Upvotes: 4

datashovel

Reputation: 152

I think this is what you're looking for:

http://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html

If you want to write a dynamic "update query" (like your example), you would write a script that tells Elasticsearch the logic to follow when transforming the values.

There are some useful examples of that here:

http://www.elastic.co/guide/en/elasticsearch/reference/current/docs-update.html

Upvotes: 0

vierja

Reputation: 436

Following up on datashovel's answer: you should use the Elasticsearch scroll API to fetch the desired documents, and then update them, either with a bulk update or one at a time.

Assuming your index is users and doc_type is user, that would be something like:

curl -XGET 'localhost:9200/users/user/_search?scroll=1m' -d '
{
    "query": {
        "constant_score": {
            "filter" : {
               "term" : {
                   "temp" : 1
               }
            }
        }
    }
}'

This returns a scroll_id (something like c2Nhbjs2OzM0NDg1ODpzRlBLc0FXNlNyNm5JWUc1), which you then use to iterate over the results:

curl -XGET  'localhost:9200/_search/scroll?scroll=1m' \
    -d 'c2Nhbjs2OzM0NDg1ODpzRlBLc0FXNlNyNm5JWUc1'

Repeat, passing the _scroll_id from the most recent response each time, until a response contains no hits.
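
In code, the scroll loop could look roughly like the Python sketch below. It assumes a plain scroll (not search_type=scan), so the first response already contains hits. fetch_next_page is a hypothetical callable that you would implement as the HTTP request to _search/scroll; here it is replaced by canned pages so the loop can be tried without a cluster.

```python
def iter_scroll_hits(first_page, fetch_next_page):
    """Yield every hit from a scrolled search, page by page.

    first_page: parsed JSON of the initial _search?scroll=... response.
    fetch_next_page: hypothetical callable taking a scroll id and returning
        the parsed JSON of the next _search/scroll response.
    """
    page = first_page
    while page["hits"]["hits"]:
        for hit in page["hits"]["hits"]:
            yield hit
        # Always continue with the scroll id from the latest response.
        page = fetch_next_page(page["_scroll_id"])


# Stand-in pages keyed by scroll id, so no cluster is needed to try the loop.
fake_pages = {
    "s1": {"_scroll_id": "s2", "hits": {"hits": [{"_id": "3"}]}},
    "s2": {"_scroll_id": "s3", "hits": {"hits": []}},
}
first = {"_scroll_id": "s1", "hits": {"hits": [{"_id": "1"}, {"_id": "2"}]}}
ids = [hit["_id"] for hit in iter_scroll_hits(first, fake_pages.__getitem__)]
print(ids)  # ['1', '2', '3']
```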

While iterating, build up a list of bulk update actions, one for each document returned by the scroll:

{ "update" : {"_id" : "1", "_type" : "user", "_index" : "users"} }
{ "doc" : {"temp" : 0} }
{ "update" : {"_id" : "2", "_type" : "user", "_index" : "users"} }
{ "doc" : {"temp" : 0} }
{ "update" : {"_id" : "3", "_type" : "user", "_index" : "users"} }
{ "doc" : {"temp" : 0} }

(You can see more detail on how to do this in the bulk API docs.)
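
As a rough illustration, a bulk body like the one above can be generated from the scroll hits with a few lines of Python. build_bulk_update_body is a name I made up, and the hits are hand-written stand-ins for real search results. The body is newline-delimited JSON and has to end with a newline; you would then POST it to /_bulk.

```python
import json


def build_bulk_update_body(hits, partial_doc):
    """Return a newline-delimited bulk body that applies partial_doc to every hit."""
    lines = []
    for hit in hits:
        # Action line: which document to update.
        action = {
            "update": {
                "_id": hit["_id"],
                "_type": hit["_type"],
                "_index": hit["_index"],
            }
        }
        lines.append(json.dumps(action))
        # Source line: the fields to merge into that document.
        lines.append(json.dumps({"doc": partial_doc}))
    # Every line, including the last one, is terminated by a newline.
    return "".join(line + "\n" for line in lines)


# Hand-written stand-ins for hits returned by the scroll.
hits = [
    {"_index": "users", "_type": "user", "_id": "1"},
    {"_index": "users", "_type": "user", "_id": "2"},
]
bulk_body = build_bulk_update_body(hits, {"temp": 0})
print(bulk_body, end="")
```

For hundreds of thousands of documents you would probably want to send the actions in batches of a few thousand rather than in one giant request.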

I don't know any PHP, but Elastica, a PHP client for Elasticsearch, has some helper functions for scrolling and bulk operations.

Upvotes: 3
