Reputation: 1947
I'm facing a problem where I have two documents, each containing an array of objects. I would like to find the one document in which a single nested object matches two properties (both at the same time, in the same object), but I always get both documents back.
I created the documents with:
POST /respondereval/_doc
{
  "resp_id": "1236",
  "responses": [
    {"key": "meta", "text": "abc"},
    {"key": "property 1", "text": "yes"},
    {"key": "property 2", "text": "yes"}
  ]
}
POST /respondereval/_doc
{
  "resp_id": "1237",
  "responses": [
    {"key": "meta", "text": "abc"},
    {"key": "property 1", "text": "no"},
    {"key": "property 2", "text": "yes"}
  ]
}
I defined a mapping for the index to prevent ES from flattening the objects, like this:
PUT /respondereval
{
  "mappings": {
    "properties": {
      "responses": {
        "type": "nested"
      }
    }
  }
}
I would now like to find the first document (resp_id 1236) with the following query:
GET /respondereval/_search
{
  "query": {
    "nested": {
      "path": "responses",
      "query": {
        "bool": {
          "must": [
            { "match": { "responses.key": "property 1" } },
            { "match": { "responses.text": "yes" } }
          ]
        }
      }
    }
  }
}
This should return only the one document in which a single element matches both conditions at the same time.
Unfortunately, it always returns both documents. I assume that at some point ES still flattens the values in the nested object arrays into something like this (simplified):
resp_id 1236: "key": ["meta", "property 1", "property 2"], "text": ["abc", "yes", "yes"]
resp_id 1237: "key": ["meta", "property 1", "property 2"], "text": ["abc", "no", "yes"]
which both contain "property 1" and "yes".
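To make that assumption concrete, here is a small sketch in plain Python (not Elasticsearch code) comparing flattened object-array matching with per-object nested matching. It uses exact string comparison and ignores text analysis entirely; the function names `flattened_match` and `nested_match` are made up for illustration.

```python
# Illustration only: plain Python, exact string comparison, no text analysis.
docs = {
    "1236": [
        {"key": "meta", "text": "abc"},
        {"key": "property 1", "text": "yes"},
        {"key": "property 2", "text": "yes"},
    ],
    "1237": [
        {"key": "meta", "text": "abc"},
        {"key": "property 1", "text": "no"},
        {"key": "property 2", "text": "yes"},
    ],
}


def flattened_match(responses, key, text):
    """Flattened semantics: the values of each field are pooled per document."""
    keys = [r["key"] for r in responses]
    texts = [r["text"] for r in responses]
    return key in keys and text in texts


def nested_match(responses, key, text):
    """Nested semantics: both conditions must hold inside the same object."""
    return any(r["key"] == key and r["text"] == text for r in responses)


flat_hits = [i for i, r in docs.items() if flattened_match(r, "property 1", "yes")]
nested_hits = [i for i, r in docs.items() if nested_match(r, "property 1", "yes")]

print(flat_hits)    # ['1236', '1237']
print(nested_hits)  # ['1236']
```

If flattening were the cause, the result would look like `flat_hits`; what I expect from the nested mapping is `nested_hits`.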
What is the correct way to solve this, so that only documents are returned that contain an element in the objects array which matches both conditions ("key": "property 1" AND "text": "yes") at the same time?
Upvotes: 0
Views: 1059
Reputation: 22956
The problem is with your mapping. Your key and text fields are mapped as text, which uses the standard analyzer by default.
The standard analyzer splits text into tokens on word boundaries such as whitespace. So "property 1" is tokenized as:
{
  "tokens": [
    {
      "token": "property",
      "start_offset": 0,
      "end_offset": 8,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "1",
      "start_offset": 9,
      "end_offset": 10,
      "type": "<NUM>",
      "position": 1
    }
  ]
}
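Similarly, "property 2" is tokenized as "property" and "2".
Hence both documents are returned. By default a match query only needs one of its tokens to match, so searching the key for "property 1" also matches the "property" token of the key "property 2". In the second document, that same object has the text "yes", so its third nested object satisfies both conditions.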
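The effect can be reproduced with a rough simulation in plain Python. This is only a sketch: `analyze` is a crude stand-in for the standard analyzer (lowercase, split on non-alphanumerics), and `match` mimics a match query with its default OR operator. Neither is an Elasticsearch API.

```python
import re


def analyze(value):
    """Crude stand-in for the standard analyzer: lowercase and split into tokens."""
    return re.findall(r"[a-z0-9]+", value.lower())


def match(field_value, query):
    """Mimics match with the default OR operator: one shared token is enough."""
    return bool(set(analyze(field_value)) & set(analyze(query)))


# Nested objects of the second document (resp_id 1237).
doc_1237 = [
    {"key": "meta", "text": "abc"},
    {"key": "property 1", "text": "no"},
    {"key": "property 2", "text": "yes"},
]

# Both conditions are checked against the same nested object.
matching = [
    r for r in doc_1237
    if match(r["key"], "property 1") and match(r["text"], "yes")
]

print(analyze("property 1"))  # ['property', '1']
print(matching)               # [{'key': 'property 2', 'text': 'yes'}]
```

So the nested query is behaving correctly here: one single object in the second document really does satisfy both analyzed conditions.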
To make it work, query the keyword sub-fields:
{
  "query": {
    "nested": {
      "path": "responses",
      "query": {
        "bool": {
          "must": [
            { "match": { "responses.key.keyword": "property 1" } },
            { "match": { "responses.text.keyword": "yes" } }
          ]
        }
      }
    }
  }
}
A more proper approach is a phrase query (match_phrase) on the key:
{
  "query": {
    "nested": {
      "path": "responses",
      "query": {
        "bool": {
          "must": [
            { "match_phrase": { "responses.key": "property 1" } },
            { "match": { "responses.text": "yes" } }
          ]
        }
      }
    }
  }
}
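As a sanity check, here is a minimal Python sketch of why both variants narrow the result to resp_id 1236. It is a simulation under assumptions, not Elasticsearch code: `keyword_match` models a keyword field as an exact whole-string comparison, `phrase_match` models match_phrase as "all query tokens adjacent and in order", and `analyze` is a crude approximation of the standard analyzer.

```python
import re

docs = {
    "1236": [
        {"key": "meta", "text": "abc"},
        {"key": "property 1", "text": "yes"},
        {"key": "property 2", "text": "yes"},
    ],
    "1237": [
        {"key": "meta", "text": "abc"},
        {"key": "property 1", "text": "no"},
        {"key": "property 2", "text": "yes"},
    ],
}


def analyze(value):
    """Crude stand-in for the standard analyzer: lowercase and split into tokens."""
    return re.findall(r"[a-z0-9]+", value.lower())


def keyword_match(field_value, query):
    """Keyword field: the whole value is compared as-is, no tokenization."""
    return field_value == query


def token_match(field_value, query):
    """Analyzed match with OR semantics: one shared token is enough."""
    return bool(set(analyze(field_value)) & set(analyze(query)))


def phrase_match(field_value, query):
    """Phrase semantics: all query tokens must appear consecutively, in order."""
    tokens, wanted = analyze(field_value), analyze(query)
    return any(
        tokens[i:i + len(wanted)] == wanted
        for i in range(len(tokens) - len(wanted) + 1)
    )


def search(key_matcher, text_matcher):
    """Return ids of documents where one nested object passes both matchers."""
    return [
        doc_id
        for doc_id, responses in docs.items()
        if any(
            key_matcher(r["key"], "property 1") and text_matcher(r["text"], "yes")
            for r in responses
        )
    ]


keyword_hits = search(keyword_match, keyword_match)
phrase_hits = search(phrase_match, token_match)

print(keyword_hits)  # ['1236']
print(phrase_hits)   # ['1236']
```

In both cases "property 2" no longer counts as a match for "property 1", which is what removes the second document.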
Upvotes: 2
Reputation: 1739
Have you tried the must query directly, without the nested path?
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "responses.key": "property 1"
          }
        },
        {
          "match": {
            "responses.text": "yes"
          }
        }
      ]
    }
  }
}
Upvotes: 0