Reputation: 149
I am new to Lucene.Net and am currently doing R&D on using it in .NET applications. Lucene.Net is a general-purpose library that knows nothing about data sources such as SQL Server or SQLite; it only knows that you have a Lucene document you want indexed. So when we load data into Lucene.Net from a data source, how can we keep the Lucene.Net documents up to date with the data in, for example, a SQL database? One way to keep the two (Lucene.Net and SQL) in sync is to update the Lucene index during each database update. But someone could also change the SQL database manually. In that scenario, how can we update the Lucene index?
Upvotes: 3
Views: 463
Reputation: 33791
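I can provide a conceptual overview of how to do this. Fundamentally you need three things:
1. A way to detect every insert, update and delete made to the source data.
2. A change log that records those changes.
3. A process that reads the change log and applies the changes to the Lucene.Net index.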
There are of course lots of different ways to handle each of these.
The easiest way to handle #1 is if your database supports insert, update and delete triggers. If it does, you can add these three triggers to every table that supplies data to the Lucene.Net index. When a record in one of those tables changes, the trigger automatically writes a record into the change log that indicates the table, the record ID and the operation (insert, update or delete). If your database does not support triggers, it's a bit harder. You could hook into some common API that your app uses to talk to the database when doing an insert, update or delete, and have that hook record the same sort of info in the change log. Note that a hook like this only sees changes made through the app, whereas triggers also catch manual changes made directly against the database.
The change log (#2) can take many forms, but the easiest way is probably to just create a table in the SQL database. This way the insert, update and delete triggers can record their observations directly by inserting a record into the changeLog table. Having it as a SQL table also works if you are writing to it from an API wrapper.
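To make the trigger and changeLog idea concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module with an in-memory SQLite database only because that is easy to run; trigger syntax differs on SQL Server and other databases. The `product` table, the changeLog column names and the trigger names are all assumptions made up for illustration.

```python
import sqlite3

# In-memory SQLite database, used here purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical source table that feeds the search index.
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);

-- Hypothetical change log: one row per observed change.
CREATE TABLE changeLog (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    table_name TEXT    NOT NULL,
    record_id  INTEGER NOT NULL,
    operation  TEXT    NOT NULL,            -- 'insert', 'update' or 'delete'
    processed  INTEGER NOT NULL DEFAULT 0   -- set to 1 once the index is updated
);

CREATE TRIGGER product_after_insert AFTER INSERT ON product
BEGIN
    INSERT INTO changeLog (table_name, record_id, operation)
    VALUES ('product', NEW.id, 'insert');
END;

CREATE TRIGGER product_after_update AFTER UPDATE ON product
BEGIN
    INSERT INTO changeLog (table_name, record_id, operation)
    VALUES ('product', NEW.id, 'update');
END;

CREATE TRIGGER product_after_delete AFTER DELETE ON product
BEGIN
    INSERT INTO changeLog (table_name, record_id, operation)
    VALUES ('product', OLD.id, 'delete');
END;
""")

# Any change to the table, including one made by hand, is logged by the triggers.
conn.execute("INSERT INTO product (id, name) VALUES (1, 'widget')")
conn.execute("UPDATE product SET name = 'gadget' WHERE id = 1")
conn.execute("DELETE FROM product WHERE id = 1")

rows = conn.execute(
    "SELECT table_name, record_id, operation, processed "
    "FROM changeLog ORDER BY id"
).fetchall()
for row in rows:
    print(row)
```

Because the triggers live in the database itself, they also fire for manual edits, which is the scenario raised in the question.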
There are a lot of ways to implement the process that applies the logged changes (#3), but probably the most robust is to use a timer to kick off a background thread that checks for unprocessed changeLog records every so many seconds. If it finds such records, it reads them in and checks whether each one is for an insert, update or delete operation, and for which table and record ID. If it is an insert or update, it reads the record from the SQL database and inserts or updates the corresponding document in Lucene.Net. If it is a delete, it directly deletes the document in Lucene.Net. Then it sets a boolean on the changeLog record to indicate that the record has been processed.
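One pass of that background process could look roughly like the sketch below. It is again Python with sqlite3, and a plain dict stands in for the Lucene.Net index, so none of this is Lucene.Net API. The function name `process_pending_changes`, the `product` table and the changeLog columns are assumptions for illustration, and the changeLog rows are inserted by hand here to simulate what triggers would have written.

```python
import sqlite3

def process_pending_changes(conn, index):
    """Apply unprocessed changeLog rows to `index`; return how many were handled."""
    pending = conn.execute(
        "SELECT id, table_name, record_id, operation FROM changeLog "
        "WHERE processed = 0 ORDER BY id"
    ).fetchall()
    for log_id, table_name, record_id, operation in pending:
        key = (table_name, record_id)
        if operation == "delete":
            index.pop(key, None)
        else:
            # Insert or update: re-read the current row from the database.
            # This sketch only knows the hypothetical `product` table.
            row = conn.execute(
                "SELECT name FROM product WHERE id = ?", (record_id,)
            ).fetchone()
            if row is None:
                index.pop(key, None)  # row was deleted after the change was logged
            else:
                index[key] = {"name": row[0]}
        # Flag the log entry so it is not applied again on the next pass.
        conn.execute("UPDATE changeLog SET processed = 1 WHERE id = ?", (log_id,))
    conn.commit()
    return len(pending)

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE changeLog (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    table_name TEXT    NOT NULL,
    record_id  INTEGER NOT NULL,
    operation  TEXT    NOT NULL,
    processed  INTEGER NOT NULL DEFAULT 0
);
INSERT INTO product (id, name) VALUES (1, 'widget'), (2, 'gizmo');
-- Simulated trigger output: product 3 has already been deleted from the table.
INSERT INTO changeLog (table_name, record_id, operation) VALUES
    ('product', 1, 'insert'),
    ('product', 2, 'update'),
    ('product', 3, 'delete');
""")

# Stand-in for the search index, holding a stale entry for the deleted record.
index = {("product", 3): {"name": "old"}}

handled = process_pending_changes(conn, index)
handled_again = process_pending_changes(conn, index)  # nothing left to do
print(handled, handled_again, index)
```

In a real service a timer would call something like `process_pending_changes` every few seconds on a background thread, and the dict operations would be replaced by the corresponding add, update and delete calls on the Lucene.Net index writer.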
There are more bells and whistles that can be added, but that should give you a pretty clear picture of how to keep the Lucene.Net index up to date in near real time.
Upvotes: 4