Thierry

Reputation: 6458

BlockingCollection with Parallel.For hangs?

I'm playing around with BlockingCollection to understand it better, but I can't work out why my code hangs once it has finished processing all my items when I consume them inside a Parallel.For.

I'm just adding a number to it (producer?):

var blockingCollection = new BlockingCollection<long>();

Task.Factory.StartNew(() =>
{
    while (count <= 10000)
    {
        blockingCollection.Add(count);
        count++;
    }
});

Then I'm trying to process (Consumer?):

Parallel.For(0, 5, x => 
{
    foreach (long value in blockingCollection.GetConsumingEnumerable())
    {
        total[x] += 1;
        Console.WriteLine("Worker {0}: {1}", x, value);
    }
});

But once it has processed all the numbers, it just hangs. What am I doing wrong?

Also, when I set my Parallel.For to 5, does it mean it's processing the data on 5 separate threads?

Upvotes: 4

Views: 1736

Answers (3)

svick

Reputation: 244777

As its name implies, operations on BlockingCollection<T> block when they can't do anything, and this includes GetConsumingEnumerable().

The reason for this is that the collection can't tell if your producer is already done, or just busy producing the next item.

What you need to do is to notify the collection that you're done adding items to it by calling CompleteAdding(). For example:

while (count <= 10000)
{
    blockingCollection.Add(count);
    count++;
}

blockingCollection.CompleteAdding();
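To put the fix in context, here is a minimal sketch of the whole producer/consumer program with CompleteAdding() in place. It assumes that count starts at 0 and that total is an int[5] (neither declaration appears in the question); the names Demo and Run are made up for illustration, and the per-item Console.WriteLine is left out to keep the output short.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class Demo
{
    // Hypothetical helper: runs the producer and the consumers, returns how many items were consumed.
    public static int Run()
    {
        var blockingCollection = new BlockingCollection<long>();
        var total = new int[5]; // assumed shape: one counter per loop index, so no slot is shared

        // Producer: add 0..10000 (assumed starting value), then signal that nothing more is coming.
        var producer = Task.Factory.StartNew(() =>
        {
            try
            {
                for (long count = 0; count <= 10000; count++)
                {
                    blockingCollection.Add(count);
                }
            }
            finally
            {
                // Without this call, GetConsumingEnumerable() below never finishes.
                blockingCollection.CompleteAdding();
            }
        });

        // Consumers: each enumeration ends once the collection is empty AND marked complete.
        Parallel.For(0, 5, x =>
        {
            foreach (long value in blockingCollection.GetConsumingEnumerable())
            {
                total[x] += 1;
            }
        });

        producer.Wait(); // surfaces any exception thrown by the producer
        return total.Sum();
    }

    public static void Main()
    {
        Console.WriteLine("Items consumed: {0}", Run()); // prints "Items consumed: 10001"
    }
}
```

Putting CompleteAdding() in a finally block is a design choice rather than a requirement: it means the consumers are released even if the producer throws partway through.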

Upvotes: 6

shay__

Reputation: 3990

Also, when I set my Parallel.For to 5, does it mean it's processing the data on 5 separate threads?

No. Quoting from a previous answer on SO (How many threads Parallel.For(Foreach) will create? Default MaxDegreeOfParallelism?):

The default scheduler for Task Parallel Library and PLINQ uses the .NET Framework ThreadPool to queue and execute work. In the .NET Framework 4, the ThreadPool uses the information that is provided by the System.Threading.Tasks.Task type to efficiently support the fine-grained parallelism (short-lived units of work) that parallel tasks and queries often represent.

Put simply, TPL creates Tasks, not threads. The framework decides how many threads should handle them.
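One way to see this for yourself is to record which threads actually run the loop bodies. The sketch below is only an illustration: ThreadCountDemo and CountDistinctThreads are made-up names, and the Thread.Sleep stands in for real work. The result varies by machine and by run, so no particular number should be expected beyond "at least 1 and at most the iteration count".

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadCountDemo
{
    // Hypothetical helper: runs `iterations` loop bodies and returns how many
    // distinct threads executed them.
    public static int CountDistinctThreads(int iterations, int maxDegreeOfParallelism)
    {
        var threadIds = new ConcurrentDictionary<int, bool>();

        // MaxDegreeOfParallelism is an upper bound on concurrency, not a thread count
        // the loop is guaranteed to reach; -1 means "no explicit limit".
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegreeOfParallelism };

        Parallel.For(0, iterations, options, x =>
        {
            threadIds.TryAdd(Thread.CurrentThread.ManagedThreadId, true);
            Thread.Sleep(10); // stand-in for real work
        });

        return threadIds.Count;
    }

    public static void Main()
    {
        int used = CountDistinctThreads(5, -1);
        Console.WriteLine("5 iterations ran on {0} distinct thread(s)", used);
    }
}
```

If you need to cap how many iterations run at once, ParallelOptions.MaxDegreeOfParallelism is the supported knob, but it only limits concurrency from above.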

Upvotes: 1

Kote

Reputation: 2256

This is how the GetConsumingEnumerable method is designed to behave.

Enumerating the collection in this way blocks the consumer thread if no items are available or if the collection is empty.

You can read more about it here

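Also, using Parallel.For(0, 5, ...) doesn't guarantee that the data will be processed on 5 separate threads. The number of threads actually used is decided at run time by the scheduler and the thread pool, and depends in part on Environment.ProcessorCount.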

Upvotes: 3
