Reputation: 441
I have a MySQL database with three tables: persons, addresses, and contact information. The persons table is the main table, with one-to-many relations to the other two tables.
As far as I've read, the main question when structuring data in Solr is: which questions do I need answered? I want to be able to search persons by address and by contact information.
My question is this:
What is best practice in a case like this, when I want to make data searchable with Solr, and what would a MySQL query look like? I see the following solutions:
Create one giant query, joining and concatenating all information at once into one field
Create different indexes (cores?) for every type of information, and join them by primary key
Fetch data from the database one join at a time, repeating the person primary key in every type of data row (persons, addresses by person id, and contact information by person id)
Fetch the person table data and put the related data in nested documents (though I don't know how this would be done)
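For reference, the first option might look roughly like this (table and column names here are assumptions about the schema):

```sql
-- One row per person; child rows collapsed into single
-- concatenated fields. DISTINCT guards against the row
-- multiplication caused by joining two one-to-many tables at once.
SELECT p.id,
       p.name,
       GROUP_CONCAT(DISTINCT a.street SEPARATOR '|') AS addresses,
       GROUP_CONCAT(DISTINCT c.value  SEPARATOR '|') AS contacts
FROM persons p
LEFT JOIN addresses a ON a.person_id = p.id
LEFT JOIN contact_information c ON c.person_id = p.id
GROUP BY p.id, p.name;
```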
Upvotes: 0
Views: 495
Reputation: 52892
As long as finding "persons" is the goal, you'll want to index each person as a separate document in Solr (in a single index). You do not want to "join" between cores: Solr has very little join support, apart from a few features that scale poorly, and none of them are the same as a join in the RDBMS sense.
Create a DataImportHandler configuration that does what you want by retrieving all the persons, then adding a multi-valued field for each address and phone number. You can do this either by joining the tables together and using GROUP_CONCAT to get several values into a single field before splitting them in your configuration, or by using sub-entities with decent caching (this requires more queries, as each sub-entity is retrieved by a separate query). Depending on the amount of data, either approach can work.
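A DIH data-config.xml using sub-entities might look roughly like this; the connection details and table/column names are assumptions, and caching can be enabled on the sub-entities with the cacheImpl/cacheKey/cacheLookup attributes:

```xml
<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/mydb"
              user="user" password="pass"/>
  <document>
    <entity name="person" query="SELECT id, name FROM persons">
      <field column="id" name="id"/>
      <field column="name" name="name"/>
      <!-- one query per person; cache to avoid N+1 round-trips -->
      <entity name="address"
              query="SELECT street AS address FROM addresses
                     WHERE person_id = '${person.id}'">
        <field column="address" name="address"/>
      </entity>
      <entity name="contact"
              query="SELECT value AS contact FROM contact_information
                     WHERE person_id = '${person.id}'">
        <field column="contact" name="contact"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```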
If you're indexing large amounts of data joined from several sources, you might want to write a small script that streams each row from the MySQL server and adds a document to Solr for each person, which lets you optimize the process even further.
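A minimal sketch of the document-building part of such a script, assuming simple (id, value) row tuples; in a real indexer the rows would be streamed from MySQL and the resulting docs POSTed in batches to Solr's /update handler:

```python
from collections import defaultdict

def build_solr_docs(persons, addresses, contacts):
    """Group child rows by person id and emit one Solr document per
    person, with multi-valued 'address' and 'contact' fields."""
    addr_by_person = defaultdict(list)
    for person_id, addr in addresses:
        addr_by_person[person_id].append(addr)
    contact_by_person = defaultdict(list)
    for person_id, value in contacts:
        contact_by_person[person_id].append(value)
    return [
        {
            "id": person_id,
            "name": name,
            "address": addr_by_person[person_id],
            "contact": contact_by_person[person_id],
        }
        for person_id, name in persons
    ]

# Inlined sample rows for illustration; these would come from
# streaming MySQL cursors in practice.
persons = [(1, "Alice"), (2, "Bob")]
addresses = [(1, "Main St 1"), (1, "Oak Ave 2"), (2, "Pine Rd 3")]
contacts = [(1, "alice@example.com"), (2, "555-0100")]

docs = build_solr_docs(persons, addresses, contacts)
print(docs[0]["address"])  # → ['Main St 1', 'Oak Ave 2']
```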
Upvotes: 1
Reputation: 696
Constructing a giant string will be okay initially, but at a later point you will most likely have to switch to the second option below, due to the noise generated by the giant string.
You should prepare a Solr document with all the possible fields to be searched for a person.
Certain fields, like address, should be multi-valued so that you can store more than one address in the Solr doc:
<add>
  <doc>
    <field name="id">Person 1</field>
    <field name="address">addr1</field>
    <field name="address">addr2</field>
  </doc>
</add>
Once you have the doc in place, the requirement is to search across all the fields. This can be achieved in two ways:
Define a copy field, say all_text, which copies the values from all the person fields (address, name, etc.), and use it as the default query field in the search handler, as below:
<requestHandler name="/query" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="qf">all_text</str>
  </lst>
</requestHandler>
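The all_text field itself would be declared in the schema together with copyField directives; the source field names here are assumptions:

```xml
<field name="all_text" type="text_general"
       indexed="true" stored="false" multiValued="true"/>
<copyField source="name" dest="all_text"/>
<copyField source="address" dest="all_text"/>
<copyField source="contact" dest="all_text"/>
```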
Define the request handler defaults with specific fields. Here you can even boost certain fields, such as name over address, for ranking:
<requestHandler name="/query" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="qf">name^5 address^2</str>
  </lst>
</requestHandler>
Upvotes: 1