Reputation: 848
I am looking for a way to extend my current Apache Solr (4.x) setup so that it can support a large number of languages. I would like to take a multicore approach, and have set up Solr with an English core and a Japanese core (for starters). To complicate things, I am given n .xml files containing the data that Solr will index. So to be clear:
I have n languages and I have n .xml files (one .xml per language). Each .xml file is identical in terms of markup; only the raw text is different.
My issue is that I can't figure out how to post, say, the english.xml file strictly to the English core and the japanese.xml file strictly to the Japanese core, so that when I visit my page at:
www.example.com/us/index.html, I am looking at the english.xml indexed results, and
www.example.com/jp/index.html gives me the japanese.xml indexed results.
Strictly speaking, one schema would be enough because the .xml files for the different languages are structured identically tagwise, but I duplicated the schema for every core because each copy will be optimized for its respective language.
if (TLDR) {
How would I independently post:
english.xml -> core-english
japanese.xml -> core-japanese
Or what would be a better approach that gives me
independent facet and search groups so that I can localize my pages?
}
Obviously I don't want to have n different instances of Solr running.
Upvotes: 0
Views: 234
Reputation: 939
Benjamin, your approach is perfect. Multicore is a great way to do it.
Suppose your server is at IP 10.10.10.10 and Solr is running on port 8983; then your multicore URLs should look something like:
10.10.10.10:8983/solr/us
10.10.10.10:8983/solr/jp
10.10.10.10:8983/solr/fr
...and so on
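Before posting anything, it can help to confirm which cores the server actually exposes. A minimal sketch, assuming the CoreAdmin handler is mounted at its default path /admin/cores; the helper names cores_status_url and list_cores are made up for illustration, and SOLR_BASE defaults to the example address above:

```shell
#!/bin/sh
# Base URL of the Solr server (example address from this answer).
SOLR_BASE="${SOLR_BASE:-http://10.10.10.10:8983/solr}"

# Hypothetical helper: print the CoreAdmin STATUS URL.
# Assumes the default admin path /admin/cores.
cores_status_url() {
  printf '%s/admin/cores?action=STATUS\n' "$SOLR_BASE"
}

# Hypothetical helper: ask Solr for the status of every loaded core.
list_cores() {
  curl "$(cores_status_url)"
}

# Usage (needs a running server):
# list_cores
```

If a core such as jp does not show up in the response, the update and select URLs for it will not work either.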
Couple of things to keep in mind:
POSTING XML
This is how you post the content of each XML file to the core for its country:
US:
curl "http://10.10.10.10:8983/solr/us/update?commit=true" -H "Content-Type: text/xml" --data-binary '<add><doc><field name="id">1</field><field name="title">First Item</field></doc><doc><field name="id">2</field><field name="title">Second Item</field></doc></add>'
FR:
curl "http://10.10.10.10:8983/solr/fr/update?commit=true" -H "Content-Type: text/xml" --data-binary '<add><doc><field name="id">1</field><field name="title">premier article</field></doc><doc><field name="id">2</field><field name="title">deuxième article</field></doc></add>'
JP:
curl "http://10.10.10.10:8983/solr/jp/update?commit=true" -H "Content-Type: text/xml" --data-binary '<add><doc><field name="id">1</field><field name="title">最初の項目</field></doc><doc><field name="id">2</field><field name="title">番目の項目</field></doc></add>'
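Since you have whole files rather than inline documents, the same request can read its body from disk using curl's @file syntax. A minimal sketch, assuming english.xml and japanese.xml sit in the current directory and are already in Solr's <add><doc>...</doc></add> update format; the helper names update_url and post_xml are made up for illustration, and SOLR_BASE defaults to the example address above:

```shell
#!/bin/sh
# Base URL of the Solr server (example address from this answer).
SOLR_BASE="${SOLR_BASE:-http://10.10.10.10:8983/solr}"

# Hypothetical helper: print the update URL for one core.
update_url() {
  printf '%s/%s/update?commit=true\n' "$SOLR_BASE" "$1"
}

# Hypothetical helper: post one XML file to one core.
# The "@" prefix makes curl read the request body from the file.
post_xml() {
  curl "$(update_url "$1")" -H "Content-Type: text/xml" --data-binary "@$2"
}

# Usage (needs a running server):
# post_xml us english.xml
# post_xml jp japanese.xml
```

Because the core name is part of the URL, each file ends up only in the index of the core it was posted to.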
SEARCHING
You can search each country independently by just querying its core:
Search query for US:
http://10.10.10.10:8983/solr/us/select?q=john
Search query for JP:
http://10.10.10.10:8983/solr/jp/select?q=ジョン
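From a script, non-ASCII search terms need to be percent-encoded, which curl can do for you. A minimal sketch, assuming each core has a default search field configured so that a bare term works; the helper names select_url and search_core are made up for illustration, and SOLR_BASE defaults to the example address above:

```shell
#!/bin/sh
# Base URL of the Solr server (example address from this answer).
SOLR_BASE="${SOLR_BASE:-http://10.10.10.10:8983/solr}"

# Hypothetical helper: print the select URL for one core.
select_url() {
  printf '%s/%s/select\n' "$SOLR_BASE" "$1"
}

# Hypothetical helper: search one core for a term.
# --get turns the --data-urlencode pairs into a query string,
# so curl percent-encodes terms such as Japanese text.
search_core() {
  curl --get "$(select_url "$1")" \
    --data-urlencode "q=$2" \
    --data-urlencode "wt=json"
}

# Usage (needs a running server):
# search_core us john
# search_core jp ジョン
```

Your /us/ and /jp/ pages would then each call only their own core, which keeps search results and facets separate per language.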
Upvotes: 1