Reputation: 767
Is there a way to use the output of an ElasticSearch script_fields to update another variable in the index?
I have an index in Elasticsearch 1.x which has _timestamp enabled but not stored (see the mapping below).
This means the timestamp can be used in searches, or read through script_fields like this:
GET twitter/_search
{
  "script_fields": {
    "script1": {
      "script": "_fields['_timestamp']"
    }
  }
}
I need to extract this timestamp and store it in the index. With the update API it is easy enough to write a script that copies any other field, e.g.:
ctx._source.t1=ctx._source.message
But how can I use the value from the script_fields output to update another field in the index? I want the field 'tcopy' to get the value of the timestamp for each document.
I also tried to get the values from Java as shown below, but it returned null.
SearchResponse response = client.prepareSearch("twitter")
    .setQuery(QueryBuilders.matchAllQuery())
    .addScriptField("test", "doc['_timestamp'].value")
    .execute().actionGet();
The mapping:
{
  "mappings": {
    "tweet": {
      "_timestamp": {
        "enabled": true,
        "doc_values": true
      },
      "properties": {
        "message": {
          "type": "string"
        },
        "user": {
          "type": "string"
        },
        "tcopy": {
          "type": "long"
        }
      }
    }
  }
}
Upvotes: 0
Views: 1078
Reputation: 767
The _timestamp field can be accessed using Java. Then, we can use the Update API to set the new field. The search request would look like this:
SearchResponse response = client.prepareSearch("twitter2")
    .setQuery(QueryBuilders.matchAllQuery())
    .addScriptField("test", "doc['_timestamp'].value")
    .execute().actionGet();
Then I can use UpdateRequestBuilder with a script that uses this value to update the index.
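The second step is only described, not shown. As a rough sketch of what that scripted update amounts to, here is the equivalent REST request body built in shell rather than Java. It assumes dynamic scripting is enabled on the cluster (as it must already be for the update script in the question) and a node at localhost:9200; build_update_body is a hypothetical helper name, not part of any Elasticsearch tooling.

```shell
#!/bin/sh
# build_update_body (hypothetical helper): print the JSON body of a scripted
# update that sets tcopy to the timestamp passed as the first argument.
# The timestamp is passed as a script parameter named ts.
build_update_body() {
  printf '{"script":"ctx._source.tcopy = ts","params":{"ts":%s}}\n' "$1"
}

# Usage for a single hit (host, type and id are assumptions for illustration):
#   curl -XPOST "http://localhost:9200/twitter2/tweet/1/_update" \
#        -d "$(build_update_body 1496806671021)"
```

Passing the value through params keeps the script text identical for every document, so only the parameter changes from one request to the next.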
Upvotes: 0
Reputation: 217504
You need to do this in two runs: first extract the timestamp data, then use that data to update your documents.
To extract the timestamp data from your twitter index you can, for instance, use elasticdump like this:
elasticdump \
  --input=http://localhost:9200/twitter \
  --output=$ \
  --searchBody '{"script_fields": {"ts": {"script": "doc._timestamp.value"}}}' > twitter.json
This will produce a file called twitter.json with the following content:
{"_index":"twitter","_type":"tweet","_id":"1","_score":1,"fields":{"ts":[1496806671021]}}
{"_index":"twitter","_type":"tweet","_id":"2","_score":1,"fields":{"ts":[1496807154630]}}
{"_index":"twitter","_type":"tweet","_id":"3","_score":1,"fields":{"ts":[1496807161591]}}
You can then easily use that file to update your documents. First create a shell script named read.sh:
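#!/bin/sh
# Read one elasticdump hit per line from stdin and copy its ts value into tcopy.
while IFS= read -r LINE; do
  # jq -r prints raw strings, so the quotes do not have to be stripped with sed
  INDEX=$(printf '%s\n' "$LINE" | jq -r '._index')
  TYPE=$(printf '%s\n' "$LINE" | jq -r '._type')
  ID=$(printf '%s\n' "$LINE" | jq -r '._id')
  TS=$(printf '%s\n' "$LINE" | jq '.fields.ts[0]')
  # Partial-document update: only the tcopy field is set
  curl -XPOST "http://localhost:9200/$INDEX/$TYPE/$ID/_update" -d "{\"doc\":{\"tcopy\":${TS}}}"
done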
And finally you can run it like this:
./read.sh < twitter.json
After the script has finished running, your documents will have a tcopy field with the _timestamp value.
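If the index holds many documents, one curl call per document can be slow. As a possible alternative, the same twitter.json file could be turned into the body of a single _bulk request. The sketch below assumes jq is installed and that the dump has the shape shown above; to_bulk is a made-up helper name, not part of elasticdump or Elasticsearch.

```shell
#!/bin/sh
# to_bulk (hypothetical helper): read elasticdump hits on stdin, one per line,
# and print two lines per hit in the _bulk format:
#   1. the update action with the document's index, type and id
#   2. the partial document that sets tcopy to the extracted timestamp
to_bulk() {
  jq -c '{update: {_index: ._index, _type: ._type, _id: ._id}},
         {doc: {tcopy: .fields.ts[0]}}'
}

# Usage (assumes Elasticsearch is listening on localhost:9200):
#   to_bulk < twitter.json > bulk.json
#   curl -XPOST "http://localhost:9200/_bulk" --data-binary @bulk.json
```

Note the use of --data-binary, which keeps the newlines that the bulk format depends on. For a very large dump it may be worth splitting bulk.json into smaller chunks before sending.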
Upvotes: 1