Reputation: 131
We use IBM WCS v7 for one of our e-commerce based requirement, in which Apache solr is embeded for the search based implementation.
As per a new requirement, there will be multiple language support for an website, ex- France version of the site can have support for english, french etc. (en_FR, fr_FR etc.) In order to configure solr with this interface, what should be the optimal indexing strategy using a single solr core ?
I got some ideas 1) using multiple fields in schema.xml for multiple languages, 2) using different solr cores for different languages.
But these approaches don't seem to be the best one fitting to the current requirement, as there will be 18 language support for the e-commerce website. Using different fields for every language will be very complicated, and also using different solr code is not a good approach as we need to apply the configurational change in all the solr cores if ever it happens as per any requirement.
Is there any other approaches, or is there any way I can associate the localeId to the indexed data and process the search result with respect to the detected language ?
Any help on this topic will be highly appreciated.
Thanks and Regards,
Jitendriya Dash
Upvotes: 2
Views: 1008
Reputation: 36
This post has already been answered by original poster and others- just summarizing that as an answer:
Recommended solution is to create one index core per locale/language. This is especially important if either the catalog or content (such as product name, description, keywords) will be different and business prefers to manage it separately for each locale. This gives the added benefit for Solr to perform its stemming and tokenization specific to that locale, if applicable.
I have been part of solutions where this approach was preferred over maintaining multiple fields or documents in the same core for each locale/language. Most number of index cores I have worked with is 6.
One must also remember that index core addition will require updates to supporting processes (Product Information Management system updates to catalog load to workspace management to stage-propagation to reindexing to cache invalidation).
Upvotes: 1