Reputation: 849
I'm using Lucene.NET.
I've got 2 threads, each indexing different content (using a different algorithm, although they might try to index the same document). They both write to the same index through a single IndexWriter instance.
I've also got a web application that needs to write to the index occasionally. (It obviously cannot use that same IndexWriter instance.)
My problem is that the web application cannot write to the index while the 2 threads are running their indexing operations, and they always are!
How do I manage this more efficiently?
Thanks
Upvotes: 0
Views: 1247
Reputation: 99510
If you don't want to use LBushkin's idea of a work queue, the other approach is to use the same IndexWriter instance in the web application as the background threads are using. You haven't explained where the 2 indexing threads are. If they are in the same process/appdomain as the web application, it should be feasible to use the same instance.
If not, then you have to use the equivalent of the work queue mentioned by LBushkin, or an adapted version of it as follows: add a third thread to the indexing process whose job is to listen for indexing requests from the web application. You can use e.g. Named Pipes for this (especially easy if you're using .NET 3.5). The web application sends indexing requests to the third thread, which uses the same IndexWriter as the other existing threads to update the index.
This is essentially the same idea as LBushkin's (the 3rd thread is a work queue consumer), but it may involve less additional coding because the existing indexing threads stay as they are.
Update: Named Pipes can be used between processes on different machines. You just need to be aware of firewall issues which may arise in certain network topologies.
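To make the third-thread idea a bit more concrete, here is a minimal sketch. It is written in Java purely for illustration and does not use any Lucene or named-pipe API: the transport is abstracted as a plain `Reader` (in .NET this would be the read end of the pipe), and `SharedWriter`, `RequestListener`, and the one-request-per-line protocol are all hypothetical assumptions, not something taken from the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

// Hypothetical stand-in for the IndexWriter instance that the two
// existing indexing threads already share. Not a Lucene type.
interface SharedWriter {
    void addDocument(String content);
}

// The "third thread": reads indexing requests sent by the web application
// and forwards each one to the same writer the other threads use.
class RequestListener implements Runnable {
    private final Reader transport;     // assumed: stands in for the pipe's read end
    private final SharedWriter writer;  // the single shared writer

    RequestListener(Reader transport, SharedWriter writer) {
        this.transport = transport;
        this.writer = writer;
    }

    @Override
    public void run() {
        try (BufferedReader in = new BufferedReader(transport)) {
            String line;
            // Assumed protocol: one request per line; blank lines are ignored.
            while ((line = in.readLine()) != null) {
                if (!line.isEmpty()) {
                    writer.addDocument(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In the indexing process you would start this with something like `new Thread(new RequestListener(pipeReader, writer)).start()`, where `pipeReader` and `writer` are whatever your transport and shared writer wrapper turn out to be. The web application then only needs the write end of the transport and never touches the index directly.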
Upvotes: 1
Reputation: 131806
I'm not very familiar with how Lucene.NET supports threading, but based on your description, you may want to create a "work queue" that other threads post work to, and use a single thread to pick up the work from the queue and use an IndexWriter to add it to the index. This way no single thread is ever starved of the opportunity to get its changes added to the index.
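As a rough illustration of the work queue described above, here is a minimal sketch. It is in Java only for illustration (the same shape maps onto a .NET thread and a blocking queue) and it does not call any Lucene API. `IndexingQueue`, `submit`, `shutdown`, and the `sink` callback are hypothetical names; the sink stands in for whatever code actually calls the IndexWriter.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Single-consumer work queue: any thread may submit work,
// but only the one consumer thread ever touches the index.
class IndexingQueue {
    // Unique sentinel object, compared by identity, that tells the consumer to stop.
    private static final String STOP = new String("STOP");

    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
    private final Thread consumer;

    // 'sink' is a hypothetical stand-in for the code that writes to the index.
    IndexingQueue(Consumer<String> sink) {
        consumer = new Thread(() -> {
            try {
                while (true) {
                    String doc = pending.take();   // blocks until work arrives
                    if (doc == STOP) {
                        return;
                    }
                    sink.accept(doc);              // the only place the index is written
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "index-writer");
        consumer.start();
    }

    // Called by the background indexers and by web requests alike; does not block.
    void submit(String doc) {
        pending.add(doc);
    }

    // Lets the consumer finish everything queued so far, then waits for it to exit.
    void shutdown() {
        pending.add(STOP);
        try {
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Items are processed in the order they were submitted, so a web request waits only behind the work already queued rather than behind an entire indexing run.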
I suspect that Lucene has to use internal locks on its full-text indexes anyway, so having more than one thread writing to the index is probably not an effective way to scale your code.
Finally, having multiple threads write to a single mutable object is often a way to introduce subtle, difficult-to-fix concurrency problems into a codebase. I generally try to avoid having multiple writers; multiple readers, on the other hand, can be quite useful.
Upvotes: 2