Reputation: 653
I am trying to update my original index settings. My initial setting looks like this:
client.create(index = "movies", body= {
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0,
"analysis": {
"filter": {
"my_custom_stop_words": {
"type": "stop",
"stopwords": stop_words
}
},
"analyzer": {
"my_custom_analyzer": {
"filter": [
"lowercase",
"my_custom_stop_words"
],
"type": "custom",
"tokenizer": "standard"
}
}
}
},
"mappings": {
"properties": {
"body": {
"type": "text",
"analyzer": "my_custom_analyzer",
"search_analyzer": "my_custom_analyzer",
"search_quote_analyzer": "my_custom_analyzer"
}
}
}
},
ignore=400
)
And I am trying to add the synonym filter to my existing analyzer (my_custom_analyzer) using client.put_settings:
client.put_settings(index='movies', body={
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0,
"analysis": {
"analyzer": {
"my_custom_analyzer": {
"filter": [
"lowercase",
"my_stops",
"my_synonyms"
],
"type": "custom",
"tokenizer": "standard"
}
},
"filter": {
"my_custom_stops": {
"type": "stop",
"stopwords": stop_words
},
"my_custom_synonyms": {
"ignore_case": "true",
"type": "synonym",
"synonyms": ["Harry Potter, HP => HP", "Terminator, TM => TM"]
}
}
}
},
"mappings": {
"properties": {
"body": {
"type": "text",
"analyzer": "my_custom_analyzer",
"search_analyzer": "my_custom_analyzer",
"search_quote_analyzer": "my_custom_analyzer"
}
}
}
},
ignore=400
)
However, when I issue a search query (searching for "HP") that queries the movies index and I'm trying to rank the documents so that the document containing "Harry Potter" 5 times is the top element in the list. Right now, it seems like the document with "HP" 3 times tops the list, so the synonyms filter isn't working. I've closed movies index before I do client.put_settings and then re-opened the index. Any help would be greatly appreciated!
Upvotes: 0
Views: 1307
Reputation: 2993
You should re-index all your data in order to apply the updated settings on all your data and fields.
The data that had already been indexed won't be affected by the updated analyzer, only documents that has been indexed after you updated the settings will be affected.
Not re-indexing your data might produce incorrect results since your old data is analyzed with the old custom analyzer and not with the new one.
The most efficient way to resolve this issue is to create a new index, and move your data from the old one to the new one with the updated settings.
Follow these steps:
POST _reindex
{
"source": {
"index": "movies"
},
"dest": {
"index": "new_movies"
}
}
DELETE movies
PUT movies
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0,
"analysis": {
"analyzer": {
"my_custom_analyzer": {
"filter": [
"lowercase",
"my_custom_stops",
"my_custom_synonyms"
],
"type": "custom",
"tokenizer": "standard"
}
},
"filter": {
"my_custom_stops": {
"type": "stop",
"stopwords": "stop_words"
},
"my_custom_synonyms": {
"ignore_case": "true",
"type": "synonym",
"synonyms": [
"Harry Potter, HP => HP",
"Terminator, TM => TM"
]
}
}
}
},
"mappings": {
"properties": {
"body": {
"type": "text",
"analyzer": "my_custom_analyzer",
"search_analyzer": "my_custom_analyzer",
"search_quote_analyzer": "my_custom_analyzer"
}
}
}
}
POST _reindex?wait_for_completion=false
{
"source": {
"index": "new_movies"
},
"dest": {
"index": "movies"
}
}
After you've verified all your data is in place you can delete new_movies
index. DELETE new_movies
Hope these help
Upvotes: 3