Merge 2 different languages into a single SOLR index using a suffix - then how to query

I know that there are similar questions about Solr. I read them all, and some give insights, but none offers a solution for exactly what I am trying to do.

  1. I have a table events that contains the columns eventid, name, description in English
  2. I have a table esp_events that contains the columns eventID, name, description in Spanish

Right now we only index the English version, so I want to add the Spanish version to the Solr index as well. As the eventid is identical in both tables, I don't want it included as a searchable field, but obviously we will need it to pull the data from both tables using the same eventid.

So my questions are:

  1. How do I define the data to be indexed (name, name_esp, description, description_esp)?
  2. Do I need to define a table that the data is sourced from, and if so, how is that done?
  3. How do I tell the PHP application to run the search against the English or the Spanish version of the fields being searched?

I did not set up the original config for SOLR, so I would appreciate you letting me know which files need to be modified to get this all to work, e.g. solrconfig.xml and schema.xml, plus any I am not aware of.

I am also open to a completely different solution to the one I outlined, as long as it's not too complex.

Thanks.

Upvotes: 0

Views: 170

Answers (1)

MatsLindh

Reputation: 52792

This is usually implemented by having separate versions of the field in the schema for each language, such as name_en, name_es, description_en, description_es etc. (as you write).

If you're using DIH, you can perform a join in the query (or use a nested entity) to retrieve the fields from the alternative language table as well.
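To illustrate the shape of that join, here is a minimal sketch in Python using an in-memory SQLite database rather than DIH itself. The table and column names come from the question; the sample rows and the `build_docs` helper are made up for illustration. The same SELECT could plausibly serve as the query of a DIH entity, with the column aliases matching the field names in the schema.

```python
import sqlite3


def build_docs(conn):
    """Join the English and Spanish tables on the shared event id.

    Returns one dict per event, with language-suffixed keys that mirror
    the per-language fields in the Solr schema.
    """
    cursor = conn.execute(
        """
        SELECT e.eventid,
               e.name        AS name_en,
               e.description AS description_en,
               s.name        AS name_es,
               s.description AS description_es
        FROM events e
        LEFT JOIN esp_events s ON s.eventID = e.eventid
        ORDER BY e.eventid
        """
    )
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor]


# Hypothetical sample data, only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE events (eventid INTEGER PRIMARY KEY, name TEXT, description TEXT);
    CREATE TABLE esp_events (eventID INTEGER PRIMARY KEY, name TEXT, description TEXT);
    INSERT INTO events VALUES (1, 'Concert', 'Live music in the park');
    INSERT INTO events VALUES (2, 'Workshop', 'Pottery for beginners');
    INSERT INTO esp_events VALUES (1, 'Concierto', 'Musica en vivo en el parque');
    """
)

docs = build_docs(conn)
```

The LEFT JOIN keeps events that have no Spanish row yet; their `name_es` and `description_es` come back as NULL, so the event is still indexed with its English fields.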

If you know which language you're querying in, you can use qf (query fields, a parameter of the dismax and edismax query parsers) to tell Solr which fields to search: name_es and description_es if the search is in Spanish, name_en and description_en if it's in English.
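On the application side this comes down to picking the qf value from the user's language before sending the request. A rough sketch in Python (the same idea carries over to PHP), assuming the edismax parser, the default `/select` handler and the `_en` / `_es` field names used above; the `build_select_query` helper and the `QUERY_FIELDS` mapping are hypothetical names, not part of Solr:

```python
from urllib.parse import urlencode

# Assumed mapping from a language code to the fields searched for it.
QUERY_FIELDS = {
    "en": "name_en description_en",
    "es": "name_es description_es",
}


def build_select_query(text, lang):
    """Build the path and query string for a language-specific search.

    Raises KeyError if the language has no fields configured.
    """
    params = {
        "q": text,
        "defType": "edismax",       # qf is read by the (e)dismax parsers
        "qf": QUERY_FIELDS[lang],   # fields to search for this language
        "wt": "json",
    }
    return "/select?" + urlencode(params)


spanish_url = build_select_query("concierto", "es")
english_url = build_select_query("concert", "en")
```

Append the result to the base URL of your core or collection. The eventid comes back with each hit regardless of which language fields were searched, so the application can still look up the rows in both tables.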

There is also a feature in more recent versions of Solr (3.5 and up) for explicit language detection.

Upvotes: 1
