Reputation: 1702
We are working on implementing Solr on an e-commerce site. The site is continuously updated with new data, either through changes to existing product information or through new products being added.
We are using it in an ASP.NET MVC3 application with SolrNet.
We are facing an issue with indexing. We currently commit using the following:
private static ISolrOperations<ProductSolr> solrWorker;

public void ProductIndex()
{
    // Initialize the connection instance if it has not been created yet
    if (solrWorker == null)
    {
        Startup.Init<ProductSolr>("http://localhost:8983/solr/");
        solrWorker = ServiceLocator.Current.GetInstance<ISolrOperations<ProductSolr>>();
    }

    var products = GetProductIdandName();
    solrWorker.Add(products);
    solrWorker.Commit();
}
This is just a simple test application where we insert only the product name and id into the Solr index. Every time it runs, the new products get indexed all at once and become available when we search. I think this creates a new data index in Solr every time it runs? Correct me if I'm wrong.
My question is:
What is the best strategy for keeping the index up to date with these continuous changes?
Upvotes: 1
Views: 5482
Reputation: 52779
When you update a document, only that document is deleted and re-inserted; Solr does not update records in place, and the other documents are untouched. When you commit, new segments are created containing the new data. On optimize, the segments are merged into a single segment.
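For illustration, a minimal SolrNet sketch of that behaviour, assuming ProductSolr exposes Id and Name properties with Id mapped to the schema's unique key (those property names are assumptions, not taken from your code):

// Re-adding a document whose unique key already exists causes Solr to
// delete the old version and insert this one; other documents are untouched.
var updated = new ProductSolr
{
    Id = "42",                        // same unique key as an existing document (assumed property)
    Name = "Updated product name"     // assumed property
};

solrWorker.Add(updated);              // queued, not yet visible to searches
solrWorker.Commit();                  // flushed to a new segment, now searchable
// solrWorker.Optimize();             // optional: merges segments; expensive, so run sparingly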
You can use an incremental build technique to add/update only the records that changed since the last build. DIH (DataImportHandler) provides delta imports out of the box; if you are handling indexing manually through jobs, you can maintain a timestamp of the last run and build from that, as in the sketch below.
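A rough sketch of the manual job approach, assuming a hypothetical GetProductsModifiedSince helper against your product store (the helper and the in-memory timestamp are illustrative, not SolrNet features):

private static DateTime lastIndexRun = DateTime.MinValue;   // persist this somewhere durable in practice

public void IncrementalProductIndex()
{
    // Fetch only products created or changed since the previous build
    // (GetProductsModifiedSince is a hypothetical data-access helper).
    var changedProducts = GetProductsModifiedSince(lastIndexRun);

    if (changedProducts.Any())
    {
        solrWorker.Add(changedProducts);   // existing ids are replaced, new ids are inserted
        solrWorker.Commit();
    }

    lastIndexRun = DateTime.UtcNow;
}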
Solr does not have a partial update operation; an update is a delete followed by an add, so you have to send the complete document again, not just the updated fields. This is not resource intensive; usually only commit and optimize are.
Solr can handle any amount of data. You can use Sharding if your data grows beyond the handling capacity of a single machine.
Upvotes: 4