Matthew Moisen

Reputation: 18289

Real Time Searching of a Lucene Index that is Updated Frequently - Is this practical?

I have a query that joins many tables and takes too long to run. I've been asked to try Lucene to speed things up. So far I have exported the query results to XML, used Java to parse the XML and Lucene to index it, and created an API in Java to query this index. This reduces the query time 6-10 fold.

However, unless a dedicated VM or machine constantly queries the database, exports the data, and reindexes it, any end user who searches the Lucene index through the API will receive stale data. Even with a machine dedicated to this purpose, the data will not be up to date on every search.

I believe that "delta import" for Solr is what I am talking about. I think that is unique to Solr though, not Lucene.

What options exist for Lucene to index data that will change with some frequency, and allow users to search/query in real time? Is this too much to ask from Lucene?

Upvotes: 1

Views: 1245

Answers (1)

varunthacker

Reputation: 2204

Solr is a search application built on top of Lucene, so the indexing and searching functionality it provides comes from Lucene.

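Lucene supports near-real-time (NRT) search, where a reader opened from the IndexWriter can see recent changes without waiting for a full commit: http://wiki.apache.org/lucene-java/NearRealtimeSearch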

As for your indexing concerns, it depends on the app that syncs data between your database and Lucene. Lucene can index at very high throughput (http://people.apache.org/~mikemccand/lucenebench/indexing.html), so your app should be smart enough to figure out which changes were made in the database and re-index only that "delta".
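To make the "delta" idea concrete, here is a minimal sketch of one way the sync app could work. It assumes each row carries a last-modified timestamp, which your schema may not have. The names `DeltaSyncSketch`, `Row`, `syncDelta`, and `lookup` are made up for illustration, and a plain map stands in for the Lucene index so the example runs without any dependencies.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from Lucene or from the question.
class DeltaSyncSketch {

    // A database row with an assumed last-modified timestamp column.
    static final class Row {
        final String id;
        final String text;
        final long modifiedAt;

        Row(String id, String text, long modifiedAt) {
            this.id = id;
            this.text = text;
            this.modifiedAt = modifiedAt;
        }
    }

    // Stand-in for the search index, keyed by primary key.
    // With Lucene, the put(...) below would roughly correspond to
    // IndexWriter.updateDocument(new Term("id", id), doc).
    private final Map<String, String> index = new HashMap<>();

    // High-water mark: the largest modifiedAt value already indexed.
    private long lastSyncedAt = Long.MIN_VALUE;

    // Re-index only the rows changed since the previous sync and return their ids.
    // Against a real database this loop would be a query along the lines of
    // SELECT ... WHERE modified_at > ? instead of a scan over every row.
    List<String> syncDelta(List<Row> table) {
        List<String> updated = new ArrayList<>();
        long newMark = lastSyncedAt;
        for (Row row : table) {
            if (row.modifiedAt > lastSyncedAt) {
                index.put(row.id, row.text);
                updated.add(row.id);
                newMark = Math.max(newMark, row.modifiedAt);
            }
        }
        lastSyncedAt = newMark;
        return updated;
    }

    // Return the indexed text for an id, or null if it was never indexed.
    String lookup(String id) {
        return index.get(id);
    }

    public static void main(String[] args) {
        DeltaSyncSketch sync = new DeltaSyncSketch();
        List<Row> table = new ArrayList<>();
        table.add(new Row("1", "alpha", 100L));
        table.add(new Row("2", "beta", 200L));
        System.out.println(sync.syncDelta(table)); // first run indexes both rows

        table.set(0, new Row("1", "alpha v2", 300L));
        System.out.println(sync.syncDelta(table)); // second run touches only row 1
    }
}
```

This sketch does not handle deleted rows, and rows that share the exact timestamp of the high-water mark could be missed, so a real sync app would need something more robust. Once the changed documents are written, an NRT reader would make them searchable shortly afterwards.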

Upvotes: 1
