Reputation: 892
We currently use Django Haystack with Solr, but we want to switch our search backend to Elasticsearch because it is easier to configure as a cluster.
In Solr, our text field used an nGram type so that searches are fuzzier and do not require an exact match on words. It was configured like this:
<field name="text" type="ngram" indexed="true" stored="true" multiValued="false" />
<fieldType name="ngram" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="15"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
Now we are trying to do the same in Elasticsearch, but we are stuck because we can't work out how to configure the nGram field the way we did in Solr, so searches never return the correct matches.
To be concrete: we index our model IDs, and every model ID is greater than 10000. With Solr, searching for 10001 returned the model with the ID 10001 (even with the nGram field). With Elasticsearch the same search returns nothing at all. How can I get the Solr behaviour back?
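To make the expected behaviour explicit, here is a minimal pure-Python sketch of what the Solr analyzer pair above does. It does not call Solr or Elasticsearch; the function names (`ngrams`, `index_terms`, `query_terms`, `matches`) are made up for illustration, and it only assumes that the n-gram filter emits every substring between the minimum and maximum length.

```python
def ngrams(token, min_gram=3, max_gram=15):
    """Return every substring of token whose length is in [min_gram, max_gram]."""
    grams = set()
    for size in range(min_gram, max_gram + 1):
        for start in range(len(token) - size + 1):
            grams.add(token[start:start + size])
    return grams


def index_terms(value):
    # Index side: keyword tokenizer (the whole value is one token),
    # then lowercase, then the n-gram filter.
    return ngrams(value.lower())


def query_terms(value):
    # Query side: keyword tokenizer and lowercase only,
    # so the query stays a single term.
    return {value.lower()}


def matches(indexed_value, query):
    # A document matches when the single query term is one of its indexed grams.
    return bool(index_terms(indexed_value) & query_terms(query))
```

With this model, `matches("10001", "10001")` is true because the full value is itself one of the indexed grams, while a query shorter than three characters can never match. If the query side were also split into n-grams, or the index side were not, the two term sets would no longer line up in the same way.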
Upvotes: 0
Views: 1092
Reputation: 752
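Try something like this. It mirrors your Solr setup: n-grams are generated only at index time, and the query string is just lowercased.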
Mapping:
"ngram": {
  "type": "string",
  "search_analyzer": "lowercase_analyzer",
  "index_analyzer": "nGram_index_analyzer"
}
Settings:
"analysis": {
  "analyzer": {
    "nGram_index_analyzer": {
      "tokenizer": "keyword",
      "filter": [
        "lowercase",
        "nGram_filter"
      ]
    },
    "lowercase_analyzer": {
      "tokenizer": "keyword",
      "filter": [
        "lowercase"
      ]
    }
  },
  "filter": {
    "nGram_filter": {
      "type": "nGram",
      "min_gram": 3,
      "max_gram": 15
    }
  }
}
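Since you are coming from Django, here is a rough sketch of how the two fragments above could be combined into a single index-creation body as a Python dict. The function name `build_index_body`, the document type `modelresult`, and the field name `text` (taken from your Solr schema) are assumptions for illustration, and how the body is passed to your Haystack backend or Elasticsearch client is not shown. Also note that `index_analyzer` and the `string` type belong to older Elasticsearch releases; newer versions use `analyzer` and `text` instead, so adjust for the version you run.

```python
import json


def build_index_body(doc_type="modelresult", field_name="text"):
    """Combine the analysis settings and the field mapping into one body."""
    analysis = {
        "analyzer": {
            # Index side: whole value as one token, lowercased, then n-grams.
            "nGram_index_analyzer": {
                "tokenizer": "keyword",
                "filter": ["lowercase", "nGram_filter"],
            },
            # Query side: whole value as one token, lowercased only.
            "lowercase_analyzer": {
                "tokenizer": "keyword",
                "filter": ["lowercase"],
            },
        },
        "filter": {
            "nGram_filter": {"type": "nGram", "min_gram": 3, "max_gram": 15},
        },
    }
    mapping = {
        doc_type: {  # hypothetical document type name
            "properties": {
                field_name: {
                    "type": "string",
                    "index_analyzer": "nGram_index_analyzer",
                    "search_analyzer": "lowercase_analyzer",
                }
            }
        }
    }
    return {"settings": {"analysis": analysis}, "mappings": mapping}


if __name__ == "__main__":
    # Print the body so it can be inspected or pasted into a create-index request.
    print(json.dumps(build_index_body(), indent=2))
```

The settings have to be in place when the index is created, and existing documents need to be reindexed afterwards for the new analyzer to take effect.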
Upvotes: 2