Reputation: 1058
Is there a mistake in the Elastic documentation?
Given the following index mapping:
PUT /my_index
{
  "settings": {
    "analysis": {
      "filter": {
        "my_synonym_filter": {
          "type": "synonym",
          "synonyms": [
            "usa,united states,u s a,united states of america"
          ]
        }
      },
      "analyzer": {
        "my_synonyms": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "my_synonym_filter"
          ]
        }
      }
    }
  }
}
Given this document:
PUT /my_index/country/1
{
  "title" : "The United States is wealthy"
}
In the documentation it states:
These phrases would not match:
The usa is wealthy
The united states of america is wealthy
The U.S.A. is wealthy
However, these phrases would:
United states is wealthy
Usa states of wealthy
The U.S. of wealthy
U.S. is america
However, this does not seem to be the case: the phrases that should match aren't matching at all! Here is the query I am running (without synonym expansion at query time, as per the documentation):
GET /my_index/country/_search
{
  "query" : {
    "match_phrase" : {
      "title" : {
        "query" : "United States is wealthy",
        "analyzer": "standard"
      }
    }
  }
}
What am I missing here?
Upvotes: 1
Views: 1290
Reputation: 17461
The example in the documentation works for me.
You probably forgot to set the analyzer for the title field in the mapping.
Example:
1) Create Index
PUT /my_index
{
  "settings": {
    "analysis": {
      "filter": {
        "my_synonym_filter": {
          "type": "synonym",
          "synonyms": [
            "usa,united states,u s a,united states of america"
          ]
        }
      },
      "analyzer": {
        "my_synonyms": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "my_synonym_filter"
          ]
        }
      }
    }
  }
}
2) Add Mapping
PUT /my_index/country/_mapping
{
  "properties" : {
    "title" : { "type" : "string", "analyzer" : "my_synonyms" }
  }
}
3) Index Document
PUT /my_index/country/1
{
  "title" : "The United States is wealthy"
}
4) Query
GET /my_index/country/_search
{
  "query" : {
    "match_phrase" : {
      "title" : {
        "query" : "United States is wealthy",
        "analyzer": "standard"
      }
    }
  }
}
5) Response :
{
  "took": 8,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.75942194,
    "hits": [
      {
        "_index": "my_index",
        "_type": "country",
        "_id": "1",
        "_score": 0.75942194,
        "_source": {
          "title": "The United States is wealthy"
        }
      }
    ]
  }
}
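To confirm that the mapping actually took effect, a quick check along these lines may help. This is only a sketch: it assumes an Elasticsearch version whose _analyze API accepts a JSON body with field and text (2.x and later, as far as I know; older releases take these as URL parameters instead). The first request shows the current mapping of the index, and the second runs the sample text through whichever analyzer the title field is mapped to.

```
GET /my_index/_mapping

GET /my_index/_analyze
{
  "field": "title",
  "text": "The United States is wealthy"
}
```

If the tokens returned are only the, united, states, is and wealthy, with no synonyms among them, the field is most likely still using the default analyzer.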
Upvotes: 1
Reputation: 2797
So close, you missed one thing!
In your query, you should change the analyzer! You have to run your query text through the my_synonyms analyzer to be able to match the synonyms. Currently, your query uses the standard analyzer, which simply tokenizes your text as united, states, is, wealthy, instead of also emitting all of your synonyms.
Change this:
GET /my_index/country/_search
{
  "query" : {
    "match_phrase" : {
      "title" : {
        "query" : "United States is wealthy",
        "analyzer": "standard"
      }
    }
  }
}
To this:
GET /my_index/country/_search
{
  "query" : {
    "match_phrase" : {
      "title" : {
        "query" : "United States is wealthy",
        "analyzer": "my_synonyms"
      }
    }
  }
}
Now, when you query, the text United States will also be expanded by the synonym filter to its synonyms, such as usa.
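To see the difference between the two analyzers for yourself, the query text can be run through each of them with the _analyze API. A rough sketch, assuming a version that accepts a JSON body for _analyze (older releases pass analyzer and text as URL parameters instead):

```
GET /my_index/_analyze
{
  "analyzer": "standard",
  "text": "United States is wealthy"
}

GET /my_index/_analyze
{
  "analyzer": "my_synonyms",
  "text": "United States is wealthy"
}
```

Comparing the two token lists shows which extra terms the synonym filter adds at each position.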
Upvotes: 1