Reputation: 3
I have been using Solr for some time, and it works fine with English. Now files containing Japanese are included as well, and this is where the major problem arises.
When I search for Japanese text, it gives inappropriate results. I tried to use Kuromoji, but I don't know how to configure it. I am unable to find a solution that handles Japanese and English at the same time.
Upvotes: 0
Views: 233
Reputation: 52792
As you do not know the language before indexing, you probably want to look into using Solr's Language Detection in an update processor. This will attempt to detect which language the content is in, and then index the content into fields postfixed with the language code (see langid.map). That way you can have separate analysis and filter chains for each language, using Japanese language features for the field which receives the Japanese content, and English language features (stemming, etc.) for the English field.
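To make the field mapping concrete, here is a rough Python sketch of what happens to a document on its way into the index. It is not Solr's code: the detect_language function is a deliberately crude stand-in that only checks for kana or CJK characters, and the field names text and language are assumptions chosen for illustration. In Solr the detection and renaming are done by the update processor itself.

```python
def detect_language(text: str) -> str:
    """Crude stand-in for a real detector: 'ja' if any kana or CJK ideograph appears."""
    for ch in text:
        cp = ord(ch)
        # Hiragana/Katakana block, or CJK Unified Ideographs block
        if 0x3040 <= cp <= 0x30FF or 0x4E00 <= cp <= 0x9FFF:
            return "ja"
    return "en"


def map_fields(doc: dict, fields=("text",)) -> dict:
    """Rename each listed field to <field>_<lang>, mimicking language-based field mapping."""
    combined = " ".join(str(doc.get(f, "")) for f in fields)
    lang = detect_language(combined)
    # Copy over every field that is not subject to mapping
    mapped = {k: v for k, v in doc.items() if k not in fields}
    for f in fields:
        if f in doc:
            mapped[f"{f}_{lang}"] = doc[f]
    # Hypothetical field recording the detected language
    mapped["language"] = lang
    return mapped
```

With this, a Japanese document ends up with a text_ja field and an English one with text_en, so the schema can attach a Japanese analyzer (for example one based on Kuromoji) to the first and an English analyzer to the second.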
If you want to search both fields when querying, use qf (if using the (e)dismax query parser) to find documents matching in any of the fields.
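As a minimal sketch of such a query, the Python below builds the query string for an edismax request that searches both language fields. The field names text_en and text_ja are assumptions (they depend on how your mapping and schema are set up), and the resulting string would be appended to your own core's select URL.

```python
from urllib.parse import urlencode, parse_qs


def build_query_string(user_query: str, fields=("text_en", "text_ja")) -> str:
    """Build URL-encoded parameters for an edismax query across several fields."""
    params = {
        "defType": "edismax",     # select the extended dismax query parser
        "q": user_query,          # the user's search terms
        "qf": " ".join(fields),   # qf takes a space-separated list of fields
    }
    return urlencode(params)
```

For example, build_query_string("東京 tower") gives a string you can append after "select?" on your core; a document matches if the terms are found in either field.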
Upvotes: 1