jeffery.yuan

Reputation: 1255

How to import part of an index from a main Solr server (based on a query) into another Solr server, and then run incremental imports of the updated index later?

I have a main Solr server (solr1) which stores the index for all my docs, and I want to implement the following:
1. First, make a full import from solr1 of the docs updated or created recently (in the last 1 or 2 weeks).
2. Make delta imports at intervals to copy doc changes from solr1 to solr2. - docs may be deleted, updated, or created during this period.

-- similar to what SqlEntityProcessor supports for importing data from a DB into Solr.

http://wiki.apache.org/solr/DataImportHandler#SolrEntityProcessor
SolrEntityProcessor can make a full-import from one Solr server to another based on a query (using the query parameter in the config file), but it seems it can't do a delta import later: it has no deltaImportQuery and deltaQuery configuration, which SqlEntityProcessor supports.

How can I implement this function?
Thanks for any help and reply :)

Upvotes: 1

Views: 1156

Answers (1)

Paige Cook

Reputation: 22555

You should be able to implement the equivalent of deltaImportQuery by using a Solr timestamp field together with a filter query.

I would suggest adding a timestamp field to your Solr schema. Below is the field definition from the Solr example schema.

   <field name="timestamp" type="date" indexed="true" stored="true" 
       default="NOW" multiValued="false"/>

This creates a field that records when each document was last added to the Solr index. You can then add a filter query (fq) option to your SolrEntityProcessor query, using DateMath, to limit the results to documents whose timestamp value is greater than or equal to the time your import handler last ran.
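
As a rough sketch of that idea, a data-config.xml on solr2 might look like the following. The host name, core name, entity name, row count, and the one-hour window are assumptions for illustration, and the timestamp field is assumed to be defined in solr1's schema. The window should be at least as long as the interval between imports.

```xml
<dataConfig>
  <document>
    <!-- Hypothetical entity: pulls documents from solr1 into this core. -->
    <!-- url is a placeholder; point it at the real solr1 core. -->
    <!-- fq uses DateMath to keep only documents indexed in the last hour. -->
    <entity name="solr1docs"
            processor="SolrEntityProcessor"
            url="http://solr1:8983/solr/collection1"
            query="*:*"
            fq="timestamp:[NOW-1HOUR TO NOW]"
            rows="500"/>
  </document>
</dataConfig>
```

Running the handler with command=full-import&clean=false should keep the documents already on solr2 and overwrite those that share a uniqueKey. For the initial load the window could be widened, for example timestamp:[NOW-14DAYS TO NOW]. A query against solr1 cannot return documents that were deleted there, so deletions would need separate handling.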

Here are some good references for how DateMath works.

Upvotes: 1
