Reputation: 66
I have documents indexed in Elasticsearch with a keyword field that holds an array of values. The mapping is as follows:
{
"alerts": {
"aliases": {},
"mappings": {
"properties": {
"recordTags": {
"type": "keyword"
}
}
}
}
}
I insert recordTags as arrays. One document has 7 unique recordTags. A second document has a single tag, which also appears in the first document.
The first document looks like this:
{
"_index": "alerts",
"_type": "_doc",
"_id": "9bcb78db-77bc-4ed9-9972-d305f145a06a",
"_version": 30,
"_seq_no": 481,
"_primary_term": 5,
"found": true,
"_source": {
"recordTags": [
"tag1",
"tag2",
"tag3",
"tag4",
"tag5",
"tag6",
"tag7"
]
}
}
The second document looks like this:
{
"_index": "alerts",
"_type": "_doc",
"_id": "582d9497-c43b-4081-a6c7-189ede176702",
"_version": 30,
"_seq_no": 481,
"_primary_term": 5,
"found": true,
"_source": {
"recordTags": [
"tag1"
]
}
}
Now, when I query for records similar to the first document based on the recordTags field, no results are returned. I use the following query:
{
"query": {
"bool": {
"should": [
{
"more_like_this": {
"fields": [
"recordTags"
],
"like": [
{
"_index": "alerts",
"_id": "9bcb78db-77bc-4ed9-9972-d305f145a06a"
}
],
"min_term_freq": 1,
"min_doc_freq": 1,
"max_query_terms": 12
}
}
]
}
}
}
Can someone explain what is going on? I am not able to figure out the issue.
Upvotes: 0
Views: 262
Reputation: 66
The cause was the minimum_should_match parameter. Its default value is 30%, which means at least 30% of the terms taken from the original document must match in a target document. If 30% of the term count is not a whole number, the value is rounded down.
Since the original document has 7 terms, 30% of that is 2.1, which rounds down to 2, so at least 2 terms must match for a document to qualify as a result. The second document shares only one term (tag1) with the first, so it was excluded. Changing the value of minimum_should_match worked.
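The arithmetic above can be checked with a short Python sketch. It only models the positive-percentage case described in this answer, and the helper name required_matches is made up for illustration. The answer does not say which value was finally used, so the minimum_should_match of 1 in the adjusted query is an assumption; it is the smallest value that lets a one-tag overlap through.

```python
import json


def required_matches(num_terms: int, percent: int = 30) -> int:
    """Number of terms that must match for a positive percentage.

    Simplified sketch: the percentage of the term count is rounded down.
    """
    return num_terms * percent // 100


source_tags = ["tag1", "tag2", "tag3", "tag4", "tag5", "tag6", "tag7"]
other_tags = ["tag1"]

needed = required_matches(len(source_tags))       # 30% of 7 = 2.1, rounded down
shared = len(set(source_tags) & set(other_tags))  # only tag1 is shared
print(f"needed={needed} shared={shared} qualifies={shared >= needed}")

# Same query as in the question, with minimum_should_match lowered.
# The value 1 is an assumption, not taken from the original answer.
query = {
    "query": {
        "bool": {
            "should": [
                {
                    "more_like_this": {
                        "fields": ["recordTags"],
                        "like": [
                            {
                                "_index": "alerts",
                                "_id": "9bcb78db-77bc-4ed9-9972-d305f145a06a",
                            }
                        ],
                        "min_term_freq": 1,
                        "min_doc_freq": 1,
                        "max_query_terms": 12,
                        "minimum_should_match": 1,
                    }
                }
            ]
        }
    }
}
print(json.dumps(query, indent=2))
```

With the default of 30%, the second document has 1 shared term against 2 required, so it does not qualify. The printed JSON can be sent as the body of a search request against the alerts index.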
Upvotes: 1