Reputation: 551
I want to search for a word, say "amend", that may appear in the data as "amending", "amendment" or even "*amend". What is the best way to search for words like these? I know a wildcard query can do this, but I can't use one because of constraints in another part of my code. Which approaches give better search performance?
Upvotes: 15
Views: 48026
Reputation: 1286
What you're describing with the terms "amend", "amendment" and "amending" is called keyword stemming. You can add a stemmer token filter to your Elasticsearch index settings.
For example:
PUT /my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "standard",
          "filter": ["lowercase", "my_stemmer"]
        }
      },
      "filter": {
        "my_stemmer": {
          "type": "stemmer",
          "name": "english"
        }
      }
    }
  }
}
With this stemmer, the terms [amend, amending, amendment] are indexed as [amend, amend, amend]. The analyzer only takes effect on fields whose mapping names it as their analyzer.
You can then run a match query, and it should return what you want.
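The settings above only define my_analyzer, so here is a minimal sketch of how a field might use it and what the match query could look like. The field name "text" is an assumption (borrowed from the other answers), and the typeless "mappings" layout assumes Elasticsearch 7 or later. The script only prints the JSON bodies; it does not contact a cluster.

```python
import json

# Hypothetical field name; replace it with the field you actually search.
FIELD = "text"

# Mapping fragment to place next to "settings" in the PUT /my_index body.
# Assumes the typeless mapping format of Elasticsearch 7+.
mappings = {
    "properties": {
        FIELD: {"type": "text", "analyzer": "my_analyzer"}
    }
}

# A match query is analyzed with the field's analyzer,
# so the search term is stemmed the same way as the indexed text.
search_body = {"query": {"match": {FIELD: "amending"}}}

print(json.dumps({"mappings": mappings}, indent=2))
print(json.dumps(search_body, indent=2))
```

The second body would be sent to GET /my_index/_search. Because both the indexed text and the query term pass through the same stemmer, searching for "amending" can also match documents that contain "amend" or "amendment".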
Upvotes: 1
Reputation: 2277
There are several ways.
Since you mention that you cannot use a wildcard query, the first option is query_string:
{
  "query": {
    "query_string": {
      "default_field": "text",
      "query": "*amend"
    }
  }
}
Second, you can use an n-gram tokenizer. See https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-ngram-tokenizer.html
It breaks a value such as "amending" into tokens like ["ame", "men", "end", ...].
Once the n-gram tokenizer is configured, index your data.
You can then query as shown below. A term query is not analyzed, so "amend" must itself be one of the indexed n-grams, which means the configured gram lengths have to include 5.
{"query":{"term":{"text":"amend"}}}
This returns the matching documents.
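To see why the gram lengths matter for the term query, here is a small pure-Python imitation of n-gram splitting. It is a simplified sketch, not Elasticsearch code: the function name ngrams is made up for illustration, and it only mimics the idea of emitting every substring within a length range.

```python
def ngrams(word, min_gram, max_gram):
    """Return every substring of word with a length from min_gram to max_gram."""
    grams = []
    for start in range(len(word)):
        for size in range(min_gram, max_gram + 1):
            if start + size <= len(word):
                grams.append(word[start:start + size])
    return grams


trigrams = ngrams("amending", 3, 3)
wider = ngrams("amending", 3, 5)

print(trigrams)             # ['ame', 'men', 'end', 'ndi', 'din', 'ing']
print("amend" in trigrams)  # False: a term query for "amend" finds no such token
print("amend" in wider)     # True: grams of length 5 include "amend"
```

With only 3-character grams, the exact token "amend" is never indexed, so the term query would not match. In recent Elasticsearch versions, a gap larger than 1 between min_gram and max_gram also requires raising the index.max_ngram_diff index setting.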
Upvotes: 5
Reputation: 8840
You can implement this using the query_string feature of Elasticsearch, assuming that you use the default standard analyzer.
{
  "query": {
    "query_string": {
      "default_field": "Customer",
      "query": "*Jo*"
    }
  }
}
You can also search multiple fields, as shown in the query below:
{
  "query": {
    "query_string": {
      "fields": [
        "Customer",
        "Name"
      ],
      "query": "*Jo*"
    }
  }
}
Upvotes: 18