Gene Golovchinsky
Gene Golovchinsky

Reputation: 6131

How to make a concurrent iterator over some source?

I would like to have an iterator that can be read by multiple threads concurrently so that I can process the data of the iterator's source in parallel. The challenge is that I can't really couple hasNext() with its logical next() as those could go to different threads. (That is, two threads can call hasNext(), each see true, and then have the second thread fail because there was only one item.) My problem is that for some sources I don't really know if it has a next element until I try to read it. One such example is reading lines from a file; another is reading Term instances from a Lucene index.

I was thinking of setting up a queue inside the iterator and feeding the queue with a separate thread. That way, hasNext() is implemented in terms of the queue size. But I don't see how I could guarantee that the queue is filled because that thread could get starved.

Should I ignore the Iterator contract, and just call next() exhaustively until a NoSuchElementException is thrown?

Is there a more elegant way of handling the problem?

Upvotes: 4

Views: 8190

Answers (5)

maxauthority
maxauthority

Reputation: 141

As an up-to-date answer I think you should use a ConcurrentLinkedQueue which is available since Java 1.5.

Upvotes: 0

alexantd
alexantd

Reputation: 3606

The chosen answer will work, but it introduces complexity and potential unnecessary buffering. Why not ignore Iterator’s contract and write your own:

public interface ConcurrentIterator<T> {

    T next() throws EndOfIterationException;

}

This will be thread-safe if your implementation is. Can even wrap an Iterator in it.

Upvotes: 0

elekwent
elekwent

Reputation: 773

I could be missing the point, but couldn't you use a synchronized block in this situation?

synchronized(iterator)
{
    if (iterator.hasNext()) element = iterator.next();
}

Here, when one thread is using the iterator, no other threads will be able to access it.

Upvotes: -1

Andreas Dolk
Andreas Dolk

Reputation: 114847

One workaround / escape comes to my mind, to keep (most of) the contract and avoid NoSuchElementExceptions: The iterator.next() could return a custom "End-of-iteration" marker object, that can be processed but is nothing but a dummy. So if one thread receives a true for hasNext() but another thread already grabbed the last item, then the first thread will get a dummy (instead of an exception).

You should be able to use this kind of iterator in all normal use cases and single threaded uses should even notice the difference. Should be useable with the enhanced for loop too.

It will only fail if one tries to wait for NoSuchElementException instead of checking hasNext(), because that exception will not be thrown because of the dummy items.

Upvotes: 1

sbridges
sbridges

Reputation: 25150

Can your threads just pull from a BlockingQueue instead of an Iterator. As you have discovered, Iterators are not well suited for concurrent access.

Pass a LinkedBlockingQueue, and have your threads do queue.poll() until nothing is left.

Upvotes: 7

Related Questions