Reputation: 2195
I have the following definition of type "taggeable":
{
  "mappings": {
    "taggeable": {
      "_all": {"enabled": false},
      "properties": {
        "category": {
          "type": "string"
        },
        "tags": {
          "type": "string",
          "term_vector": "yes"
        }
      }
    }
  }
}
I also have these 5 documents:
Document1 (tags: "t1 t2", category: "cat1")
Document2 (tags: "t1" , category: "cat1")
Document3 (tags: "t1 t3", category: "cat1")
Document4 (tags: "t4" , category: "cat1")
Document5 (tags: "t4" , category: "cat2")
The following query:
{
  "query": {
    "more_like_this": {
      "fields": ["tags"],
      "like": ["t1", "t2"],
      "min_term_freq": 1,
      "min_doc_freq": 1
    }
  }
}
is returning:
Document1 (tags: "t1 t2", category: "cat1")
Document2 (tags: "t1" , category: "cat1")
Document3 (tags: "t1 t3", category: "cat1")
Which is right, but this query:
{
"query": {
"filtered": {
"query": {
"more_like_this" : {
"fields" : ["tags"],
"like" : ["t1", "t2"],
"min_term_freq" : 1,
"min_doc_freq": 1
},
"filter": {
"bool": {
"must": [
{"match": { "category": "cat1"}}
]
}
}
}
} }
is returning:
Document1 (tags: "t1 t2", category: "cat1")
Document4 (tags: "t4" , category: "cat1")
Document2 (tags: "t1" , category: "cat1")
Document3 (tags: "t1 t3", category: "cat1")
That is, Document4 is now also retrieved, and its score is similar to that of Document1, which is a perfect match, even though Document4 contains none of the terms in "t1 t2".
Does anyone know what is happening? I'm using Elasticsearch 2.4.6.
Thanks in advance
Upvotes: 0
Views: 865
Reputation: 33341
This is a great example of why consistent indentation is important. Here, I've modified what you've posted with consistent indentation, and the problem is much more apparent (JSONLint is a handy tool, if you aren't using an editor that helps with this):
{
  "query": {
    "filtered": {
      "query": {
        "more_like_this": {
          "fields": ["tags"],
          "like": ["t1", "t2"],
          "min_term_freq": 1,
          "min_doc_freq": 1
        },
        "filter": {
          "bool": {
            "must": [{
              "match": {
                "category": "cat1"
              }
            }]
          }
        }
      }
    }
  }
}
Your filter is a child of "query", instead of a child of "filtered".
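As a sketch of that fix, assuming everything else in the query stays as posted, moving "filter" up one level so that it sits next to the inner "query" would look roughly like this:

```json
{
  "query": {
    "filtered": {
      "query": {
        "more_like_this": {
          "fields": ["tags"],
          "like": ["t1", "t2"],
          "min_term_freq": 1,
          "min_doc_freq": 1
        }
      },
      "filter": {
        "bool": {
          "must": [{
            "match": {
              "category": "cat1"
            }
          }]
        }
      }
    }
  }
}
```

With this nesting, "query" and "filter" are siblings under "filtered", which is the shape the filtered query expects.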
Really though, you shouldn't use filtered at all: it has been deprecated since Elasticsearch 2.0, see here. You should change it to a bool query, as specified there.
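A rough sketch of the bool version, assuming the same fields and values as in the question: the more_like_this query goes under "must", and the category restriction goes under "filter", so it narrows the results without contributing to the score. This has not been run against 2.4.6, so treat it as a starting point rather than a verified query:

```json
{
  "query": {
    "bool": {
      "must": {
        "more_like_this": {
          "fields": ["tags"],
          "like": ["t1", "t2"],
          "min_term_freq": 1,
          "min_doc_freq": 1
        }
      },
      "filter": {
        "match": {
          "category": "cat1"
        }
      }
    }
  }
}
```

With the sample documents from the question, this should match only documents that share a tag with "t1 t2" and have category "cat1", so Document4 would no longer be expected in the results.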
Upvotes: 1