Reputation: 983
I have an issue with multi-word tags such as "social media", "two words", or "tag with many spaces": their score is multiplied by the number of words in the search query.
How can I search for "two words" as a single term, so that searching for "two" and searching for "two words" give the same score?
Here is a visual representation of the current scores:
+-----------------------+-------+
| search                | score |
+-----------------------+-------+
| two                   | 2.76  |
| two words             | 5.53  |
| tag with many spaces  | 11.05 |
| singleword            | 2.76  |
+-----------------------+-------+
Here is a visual representation of what I want:
+-----------------------+-------+
| search                | score |
+-----------------------+-------+
| two                   | 2.76  |
| two words             | 2.76  |
| tag with many spaces  | 2.76  |
| singleword            | 2.76  |
+-----------------------+-------+
Each document has multiple tags. The search input is split on commas in PHP, and each resulting tag becomes one clause in a query like the one below.
Assuming the user searches for the tags "two words" and "singleword", this would be the search query:
"query": {
  "function_score": {
    "query": {
      "bool": {
        "should": [
          {
            "match": {
              "tags.name": "two words"
            }
          },
          {
            "match": {
              "tags.name": "singleword"
            }
          }
        ]
      }
    },
    "functions": [
      {
        "field_value_factor": {
          "field": "tags.votes"
        }
      }
    ],
    "boost_mode": "multiply"
  }
}
The score is different when searching for "two" instead of "two words".
Here is what the result looks like when searching for "two words":
{
  "_index": "index",
  "_type": "type",
  "_id": "u10q42cCZsbFNf1W0Tdq",
  "_score": 4.708793,
  "_source": {
    "url": "example.com",
    "title": "title of the document",
    "description": "some description of the document",
    "popularity": 9,
    "tags": [
      {
        "name": "two words",
        "votes": 1
      },
      {
        "name": "singleword",
        "votes": 1
      },
      {
        "name": "othertag",
        "votes": 1
      },
      {
        "name": "random",
        "votes": 1
      }
    ]
  }
}
Here is the result when searching for "two" instead of "two words":
{
  "_index": "index",
  "_type": "type",
  "_id": "u10q42cCZsbFNf1W0Tdq",
  "_score": 3.4481666,
  "_source": {
    "url": "example.com",
    "title": "title of the document",
    "description": "some description of the document",
    "popularity": 9,
    "tags": [
      {
        "name": "two words",
        "votes": 1
      },
      {
        "name": "singleword",
        "votes": 1
      },
      {
        "name": "othertag",
        "votes": 1
      },
      {
        "name": "random",
        "votes": 1
      }
    ]
  }
}
Here is the mapping (for the tags specifically):
"tags": {
  "type": "nested",
  "include_in_parent": true,
  "properties": {
    "name": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        }
      }
    },
    "votes": {
      "type": "long"
    }
  }
}
I have tried searching with "\"two words\"" and "*two words*", but it made no difference.
Is it possible to achieve this?
Upvotes: 1
Views: 3457
Reputation: 7221
You should match against the non-analyzed version of the field (the keyword sub-field, tags.name.keyword) and switch to a term query.
Can you try:
"query": {
  "function_score": {
    "query": {
      "bool": {
        "should": [
          {
            "term": {
              "tags.name.keyword": "two words"
            }
          },
          {
            "term": {
              "tags.name.keyword": "singleword"
            }
          }
        ]
      }
    },
    "functions": [
      {
        "field_value_factor": {
          "field": "tags.votes"
        }
      }
    ],
    "boost_mode": "multiply"
  }
}
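The question mentions that the search input is split on commas in PHP before the query is built. As a rough illustration of that step, here is a minimal Python sketch that produces the term clauses above from a comma-separated string. The function name build_tag_query is hypothetical, and the field names are taken from the mapping in the question.

```python
import json
from typing import Any, Dict


def build_tag_query(raw_tags: str) -> Dict[str, Any]:
    """Build the function_score query body from a comma-separated tag string."""
    # Split on commas only, so a tag such as "two words" stays a single value.
    tags = [t.strip() for t in raw_tags.split(",") if t.strip()]
    # One term clause per tag, against the non-analyzed keyword sub-field.
    should = [{"term": {"tags.name.keyword": tag}} for tag in tags]
    return {
        "query": {
            "function_score": {
                "query": {"bool": {"should": should}},
                "functions": [{"field_value_factor": {"field": "tags.votes"}}],
                "boost_mode": "multiply",
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_tag_query("two words, singleword"), indent=2))
```

Because the keyword sub-field is not analyzed, each value has to match the stored tag exactly, including case.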
With your current implementation, a match query for "two words" is analyzed into the tokens "two" and "words", and each token is searched for in your tags. A document with the tag "two words" matches both tokens, so it scores higher than it does for a single-word query.
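To make the difference concrete, here is a small Python sketch that imitates the two behaviours. It is not Elasticsearch code: analyze is a simplified stand-in (lowercase, then split on whitespace) for what an analyzed text field does, and match_hits and term_hits are hypothetical helper names. It only shows why the number of matching tokens grows with the number of words, which is consistent with the scores in the question rising from 2.76 to 5.53 to 11.05.

```python
from typing import List


def analyze(text: str) -> List[str]:
    # Simplified stand-in for a text analyzer: lowercase, split on whitespace.
    return text.lower().split()


def match_hits(query: str, tag: str) -> List[str]:
    """Query tokens that are found in the analyzed tag (match-style lookup)."""
    tag_tokens = set(analyze(tag))
    return [token for token in analyze(query) if token in tag_tokens]


def term_hits(query: str, tag: str) -> bool:
    """Exact comparison against the whole tag (term-style lookup on a keyword)."""
    return query == tag


if __name__ == "__main__":
    for query in ("two", "two words"):
        print(query, match_hits(query, "two words"), term_hits(query, "two words"))
```

With the match-style lookup, "two words" hits two tokens while "two" hits one. With the term-style lookup, only the complete tag "two words" matches, and "two" on its own does not.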
Upvotes: 2