Reputation: 1130
I am trying to create a script using the script_score
of the function_score
.
I have several documents whose rankings
field is type="nested"
.
The mapping for the field is:
"rankings": {
"type": "nested",
"properties": {
"rank1": {
"type": "long"
},
"rank2": {
"type": "float"
},
"subject": {
"type": "text"
}
}
}
A sample document is:
"rankings": [
{
"rank1": 1051,
"rank2": 78.5,
"subject": "s1"
},
{
"rank1": 45,
"rank2": 34.7,
"subject": "s2"
}]
What I want to achieve is to iterate over the nested objects of rankings. Actually, I need to use i.e. a for loop in order to find a particular subject
and use the rank1, rank2
to compute something.
So far, I use something like this but it does not seem to work (throwing a Compile error):
"function_score": {
"script_score": {
"script": {
"lang": "painless",
"inline":
"sum = 0;"
"for (item in doc['rankings_cug']) {"
"sum = sum + doc['rankings_cug.rank1'].value;"
"}"
}
}
}
I have also tried the following options:
for
loop using :
instead of in
: for (item:doc['rankings'])
with no success.for
loop using in
but trying to iterate over a specific element of the object, i.e. the rank1
: for (item in doc['rankings.rank1'].values)
, which actually compile but it seems that it finds a zero-length array of rank1
.I have read that _source
element is the one which can return JSON-like objects, but as far as I found out it is not supported in Search queries.
Can you please give me some ideas of how to proceed with that?
Thanks a lot.
Upvotes: 22
Views: 50005
Reputation: 1339
You can access _source via params._source
. This one will work:
PUT /rankings/result/1?refresh
{
"rankings": [
{
"rank1": 1051,
"rank2": 78.5,
"subject": "s1"
},
{
"rank1": 45,
"rank2": 34.7,
"subject": "s2"
}
]
}
POST rankings/_search
POST rankings/_search
{
"query": {
"match": {
"_id": "1"
}
},
"script_fields": {
"script_score": {
"script": {
"lang": "painless",
"inline": "double sum = 0.0; for (item in params._source.rankings) { sum += item.rank2; } return sum;"
}
}
}
}
DELETE rankings
Upvotes: 26
Reputation: 101
For Nested objects in an array, iterated over the items and it worked. Following is my sample data in elasticsearch index:
{
"_index": "activity_index",
"_type": "log",
"_id": "AVjx0UTvgHp45Y_tQP6z",
"_version": 4,
"found": true,
"_source": {
"updated": "2016-12-11T22:56:13.548641",
"task_log": [
{
"week_end_date": "2016-12-11",
"log_hours": 16,
"week_start_date": "2016-12-05"
},
{
"week_start_date": "2016-03-21",
"log_hours": 0,
"week_end_date": "2016-03-27"
},
{
"week_start_date": "2016-04-24",
"log_hours": 0,
"week_end_date": "2016-04-30"
}
],
"created": "2016-12-11T22:56:13.548635",
"userid": 895,
"misc": {
},
"current": false,
"taskid": 1023829
}
}
Here is the "Painless" script to iterate over nested objects:
{
"script": {
"lang": "painless",
"inline":
"boolean contains(def x, def y) {
for (item in x) {
if (item['week_start_date'] == y){
return true
}
}
return false
}
if(!contains(ctx._source.task_log, params.start_time_param) {
ctx._source.task_log.add(params.week_object)
}",
"params": {
"start_time_param": "2016-04-24",
"week_object": {
"week_start_date": "2016-04-24",
"week_end_date": "2016-04-30",
"log_hours": 0
}
}
}
}
Used above script for update: /activity_index/log/AVjx0UTvgHp45Y_tQP6z/_update In the script, created a function called 'contains' with two arguments. Called the function. The old groovy style: ctx._source.task_log.contains() will not work since ES 5.X stores nested objects in a separate document. Hope this helps!`
Upvotes: 7
Reputation: 117
Unfortunately, ElasticSearch scripting in general does not support the ability to access nested documents in this way (including Painless). Perhaps, consider a different structure to your mappings where rankings are stored in multi-valued fields if you need to be able to iterate across them in such a way. Ultimately, the nested data will need to de-normalized and put into the parent documents to be able to gets scores in the way described here.
Upvotes: 10