Sunspot/rails configuration for multi-core (for different language docs) Solr 5 in one environment

Question

I created two cores for English and Japanese documents in Solr 5.1, and I am wondering how to set up Sunspot/Rails to choose a core depending on the locale selected in my Rails app.

The default sunspot.yml shows one core for each of the production, development, and test environments, but in my case there are two cores in one environment.

Is it possible to handle multiple cores in one environment with Sunspot?

Using URLs, I can query these cores by language as shown below, so I am still looking for a configuration that selects the core by the user's locale.

server:port/solr/#/EN_core/query?q=text

server:port/solr/#/JP_core/query?q='テキスト'

makio · Accepted Answer

I figured out how to index multilingual documents in a single Solr instance and search the indexed documents by a specified language from Sunspot/Rails. This method uses different fields instead of different cores for the languages, so it is not a direct answer to my question, but it is a working example of handling multilingual documents with Sunspot/Solr/Rails.

For example, the index/search field is “description” of the Entry model. Some entries have descriptions in English and the others in Japanese. I use Solr's language detection during indexing (https://cwiki.apache.org/confluence/display/solr/Detecting+Languages+During+Indexing) and copyField to deal with Sunspot's behavior of appending “_text” to searchable field names.

Add empty string fields “description_en” and “description_ja” to the Entry model with Rails migration commands. It may sound strange, but these empty fields enable Sunspot to search the documents in either English or Japanese. The commands look like the ones below, but the migration took quite a long time for more than 10 million records. I should consider other methods here - https://www.onehub.com/blog/2009/09/15/adding-columns-to-large-mysql-tables-quickly/
```
 rails generate migration AddLanguageHolderToEntry description_en:string description_ja:string
 rake db:migrate
```

Add a searchable block to the Entry model.
```
class Entry < ActiveRecord::Base
  searchable do
    text :description, :description_en, :description_ja
  end
end
```

Configure solrconfig.xml to enable Solr's language detection during indexing.

Add the following updateRequestProcessorChain. Use “description_text” in langid.fl instead of “description”, because Sunspot appends “_text” to the field name.

 
   
     true
     description_text
     en,ja
     true
     language
     en
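
To make the effect of this chain more concrete, here is a rough Ruby sketch of what language identification does to a single document. It is not Solr code. It assumes the values above correspond to the langid.whitelist (en,ja), langid.langField (language), and langid.fallback (en) parameters, it takes the detected language as an input instead of detecting it, and the method name `apply_langid` is hypothetical.

```ruby
# Rough simulation of a langid update chain on one document (not Solr code).
# Assumed settings, read from the values in the chain above:
WHITELIST  = %w[en ja].freeze   # assumed langid.whitelist
LANG_FIELD = 'language'         # assumed langid.langField
FALLBACK   = 'en'               # assumed langid.fallback
SOURCE     = 'description_text' # langid.fl (Sunspot's name for :description)

# `detected` stands in for the detector's output; real detection happens in Solr.
def apply_langid(doc, detected)
  # Languages outside the whitelist fall back to the default language.
  lang = WHITELIST.include?(detected) ? detected : FALLBACK
  mapped = doc.merge(LANG_FIELD => lang)
  # With mapping enabled, the text is stored under a per-language field name.
  # This sketch keeps the original field too; Solr may or may not, depending
  # on its mapping settings.
  mapped["#{SOURCE}_#{lang}"] = doc[SOURCE] if doc.key?(SOURCE)
  mapped
end

doc = { 'id' => 'Entry 1', 'description_text' => 'テキスト' }
p apply_langid(doc, 'ja')
```

The per-language field produced here (for example description_text_ja) is what the schema.xml fields further down have to account for.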

I also added langid to the requestHandlers of “/update” and "/update/extract" as follows.


 
   langid
 




    true
    ignored_
    true
    links
    ignored_
    langid

Check that the library paths in solrconfig.xml are correct, so that Solr can load the language detection libraries.

Configure schema.xml

Add fields for “description”. The “_text_en” and “_text_ja” fields hold the output of Solr's language detection, and the “_en_text” and “_ja_text” fields are used for indexing and searching by Sunspot.

For the detected language.

These copyFields are set for searching.
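
The naming convention behind these fields and copyFields can be sketched in Ruby. This assumes that each copyField copies the detected-language field into the field that Sunspot searches, as the description above implies, and it uses the `ja` code from the searchable block. The helper names are hypothetical, and the code only builds field names; it does not talk to Solr.

```ruby
# Builds the Solr field names involved for one Sunspot text field (names only).
LANGS = %w[en ja].freeze

# Field written by language detection, e.g. description_text_en.
def detected_field(base, lang)
  "#{base}_text_#{lang}"
end

# Field Sunspot queries for `text :description_en`, e.g. description_en_text.
def sunspot_field(base, lang)
  "#{base}_#{lang}_text"
end

# Assumed copyField pairs: detected field (source) => Sunspot field (dest).
def copy_field_pairs(base)
  LANGS.map { |lang| [detected_field(base, lang), sunspot_field(base, lang)] }.to_h
end

copy_field_pairs('description').each do |source, dest|
  puts "#{source} -> #{dest}"
end
```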

The “text_en” and “text_ja” fieldTypes are needed in schema.xml. I omit their detailed configuration here, but they use the standard analyzers.

.....
.....

Run the indexing from Sunspot
```
bundle exec rake sunspot:reindex
```
Search the documents from the Rails console to test.
```
rails console
```

For English documents:

```
@search = Entry.search do
  fulltext 'keyword_en' do
    fields(:description_en)
  end
end
```

For Japanese documents:

```
@search = Entry.search do
  fulltext 'キーワード' do
    fields(:description_ja)
  end
end

@search.results
```
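
To tie this back to the original question, the search field can be picked from the user's locale. The following is a minimal sketch, assuming the locales are `:en` and `:ja`; the helper name `description_field_for` and the fallback to English are my own assumptions for illustration. In a Rails app the locale would typically come from `I18n.locale`.

```ruby
# Maps a locale to the Sunspot field defined in the searchable block.
# The helper name and the English fallback are assumptions for illustration.
LOCALE_FIELDS = { en: :description_en, ja: :description_ja }.freeze

def description_field_for(locale)
  LOCALE_FIELDS.fetch(locale.to_sym, :description_en)
end

# Intended use inside the Rails app (requires Sunspot and the Entry model):
#   field = description_field_for(I18n.locale)
#   query = 'keyword'
#   @search = Entry.search do
#     fulltext query do
#       fields(field)
#     end
#   end
puts description_field_for(:ja)
```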

As you can see, this is an ad hoc method, and I welcome any comments on it.
