Arbflow

Reputation: 1

Search result mismatch in Solr 4.7 for Spanish characters

Search in Solr 4.7 does not return the expected results for Spanish characters such as ñ and Ñ. Looking through the Solr documentation, I found that these characters fall outside the ASCII range.

How can non-ASCII characters be mapped to their ASCII equivalents? For example, the Solr index contains ñ and Ñ [LATIN CAPITAL LETTER N WITH TILDE] as well as plain n and N. Which filter or tokenizer should be used so that a search for either plain N or Ñ matches both?

The character Ń [LATIN CAPITAL LETTER N WITH ACUTE] is an exception: it already works.

Upvotes: 0

Views: 275

Answers (1)

cheffe

Reputation: 9500

I tried the ICUFoldingFilterFactory, and it works fine with those accented characters. If it is tricky to set up, have a look at the SO question "Can not use ICUTokenizerFactory in Solr".

This analyzer

<fieldType name="spanish" class="solr.TextField">
    <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory" />
        <filter class="solr.ICUFoldingFilterFactory" />
    </analyzer>
</fieldType>

got me these analysis results (the screenshot is taken from solr-admin):

[Screenshot: analysis results from solr-admin for Spanish input]
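
If adding the ICU jars is not an option, a possible alternative is the ASCIIFoldingFilterFactory that ships with Solr core. The following is only a sketch and has not been tested against the asker's index. The field type name spanish_ascii is made up for illustration, and the LowerCaseFilterFactory is included on the assumption that case-insensitive matching is wanted, since ASCII folding by itself does not change case.

```xml
<!-- Hypothetical field type name; pick whatever fits your schema. -->
<fieldType name="spanish_ascii" class="solr.TextField">
    <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory" />
        <!-- Assumption: case-insensitive search is desired, so Ñ becomes ñ first. -->
        <filter class="solr.LowerCaseFilterFactory" />
        <!-- Folds accented Latin characters to their ASCII base letter, e.g. ñ to n. -->
        <filter class="solr.ASCIIFoldingFilterFactory" />
    </analyzer>
</fieldType>
```

Because the analyzer has no type attribute, it applies at both index and query time, so ñ, Ñ, n and N should all end up as the token n. Documents that are already indexed need to be reindexed after the field type changes.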

Upvotes: 1
