Aleks Slade

Reputation: 221

What does Parallel.ForEach do behind the scenes?

So I just can't grasp the concept here. I have a method that uses the Parallel class with the ForEach method. But the thing I don't understand is: does it create new threads so it can run the function faster?

Let's take this as an example. I do a normal foreach loop.

private static void DoSimpleWork()
{
    foreach (var item in collection)
    {
        //DoWork();
    }
}

What that will do is take the first item in the list, call DoWork() for it, and wait until it finishes before moving on to the next item. Simple, plain, and it works.

Now, there are three cases I am curious about. First, if I do this:

Parallel.ForEach(stringList, simpleString =>
{
    DoMagic(simpleString);
});

Will that split up the foreach into, let's say, 4 chunks? What I think is happening is that it takes the first 4 items in the list, assigns each string to its own "thread" (assuming Parallel creates 4 virtual threads), does the work, and then starts on the next 4 in the list. If that is wrong, please correct me; I really want to understand how this works.

And then we have this, which is essentially the same but with a new parameter:

Parallel.ForEach(stringList, new ParallelOptions() { MaxDegreeOfParallelism = 32 }, simpleString =>
{
    DoMagic(simpleString);
});

What I am curious about is this

new ParallelOptions() { MaxDegreeOfParallelism = 32 }

Does that mean it will take the first 32 strings from that list (if there are even that many in the list) and then do the same thing I described above?

And for the last one.

Task.Factory.StartNew(() =>
{
    Parallel.ForEach(stringList, simpleString =>
    {
        DoMagic(simpleString);
    });
});

Would that create a new task, assigning each "chunk" to its own task?

Upvotes: 6

Views: 4716

Answers (3)

Tony Thomas

Reputation: 436

does the work and then starts with the next 4 in that list?

This depends on your machine's hardware and on how busy its cores are with other processes and apps. The items are not handed out in fixed, lock-step batches of 4.
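If you do want explicit chunks, one option is to ask for them with a range partitioner. The sketch below is only an illustration: `ChunkDemo` and `ProcessInChunks` are hypothetical names, and it uses `Partitioner.Create(fromInclusive, toExclusive, rangeSize)` to hand index ranges to `Parallel.ForEach`. Which thread picks up which range is still up to the runtime.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ChunkDemo
{
    // Hypothetical helper: records which thread handled each index range.
    public static List<(int From, int To, int ThreadId)> ProcessInChunks(int itemCount, int chunkSize)
    {
        var log = new ConcurrentBag<(int From, int To, int ThreadId)>();

        // Each range is a Tuple<int, int> covering [Item1, Item2).
        var ranges = Partitioner.Create(0, itemCount, chunkSize);

        Parallel.ForEach(ranges, range =>
        {
            // A whole range is processed by whichever worker thread picks it up.
            log.Add((range.Item1, range.Item2, Environment.CurrentManagedThreadId));
        });

        // Sort by start index so the result is stable regardless of scheduling.
        return log.OrderBy(entry => entry.From).ToList();
    }

    public static void Main()
    {
        foreach (var (from, to, threadId) in ProcessInChunks(10, 4))
        {
            Console.WriteLine($"items [{from}, {to}) ran on thread {threadId}");
        }
    }
}
```

With 10 items and a chunk size of 4 this yields three ranges, the last one shorter than the others. The thread ids will differ from run to run.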

Does that mean it will take the first 32 strings from that list (if there are even that many in the list) and then do the same thing I described above?

No, there is no guarantee that it will take the first 32; it could be fewer. It will vary each time you execute the same code.

Task.Factory.StartNew creates a new task, but it will not create a new one for each chunk as you expect.

Putting a Parallel.ForEach inside a new Task will not help you further reduce the time taken for the parallel tasks themselves.

Upvotes: 0

Alander

Reputation: 813

Parallel.ForEach performs the equivalent of a C# foreach loop, but with iterations that may run in parallel instead of sequentially. There is no guaranteed ordering: an iteration runs whenever the thread pool has a thread available for it.
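A small sketch of that behaviour, assuming nothing beyond the standard library (`OrderDemo` and `RunAndRecordOrder` are made-up names for illustration): every item is processed exactly once and the call only returns when all iterations are done, but the order in which iterations finish is not fixed.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class OrderDemo
{
    // Hypothetical helper: returns the items in the order their iterations finished.
    public static List<int> RunAndRecordOrder(int itemCount)
    {
        var finished = new ConcurrentQueue<int>();

        // Parallel.ForEach blocks the calling thread until every iteration has completed.
        Parallel.ForEach(Enumerable.Range(0, itemCount), item =>
        {
            Thread.Sleep(5);          // stand-in for real work
            finished.Enqueue(item);   // thread-safe append
        });

        return finished.ToList();
    }

    public static void Main()
    {
        // The printed order may change between runs and between machines.
        Console.WriteLine(string.Join(", ", RunAndRecordOrder(16)));
    }
}
```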

MaxDegreeOfParallelism 

By default, For and ForEach will utilize however many threads the underlying scheduler provides, so changing MaxDegreeOfParallelism from the default only limits how many concurrent tasks will be used by the loop.

You do not need to modify this parameter in general but may choose to change it in advanced scenarios:

  1. When you know that a particular algorithm you're using won't scale beyond a certain number of cores. You can set the property to avoid wasting cycles on additional cores.

  2. When you're running multiple algorithms concurrently and want to manually define how much of the system each algorithm can utilize.

  3. When the thread pool's heuristics are unable to determine the right number of threads to use and could end up injecting too many threads. E.g. with long-running loop body iterations, the thread pool might not be able to tell the difference between reasonable progress and livelock or deadlock, and might not be able to reclaim threads that were added to improve performance. You can set the property to ensure that you don't use more than a reasonable number of threads.
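To see that MaxDegreeOfParallelism is a cap and not a batch size, here is a minimal sketch (the names `DegreeDemo` and `MeasurePeakConcurrency` are hypothetical) that counts how many loop bodies are running at the same moment. The observed peak should never exceed the configured limit, though it can be lower.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class DegreeDemo
{
    // Hypothetical helper: returns the highest number of loop bodies seen running at once.
    public static int MeasurePeakConcurrency(int itemCount, int maxDegree)
    {
        var gate = new object();
        int running = 0;
        int peak = 0;

        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegree };

        Parallel.For(0, itemCount, options, i =>
        {
            lock (gate)
            {
                running++;
                if (running > peak) peak = running;
            }

            Thread.Sleep(10); // stand-in for real work

            lock (gate)
            {
                running--;
            }
        });

        return peak;
    }

    public static void Main()
    {
        Console.WriteLine($"peak concurrency with a limit of 2: {MeasurePeakConcurrency(20, 2)}");
    }
}
```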

Task.Factory.StartNew is usually used when you require fine-grained control for a long-running, compute-bound task and, as @Сергей Боголюбов mentioned, you should not mix the two up.

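Task.Factory.StartNew creates a single new task. That task runs Parallel.ForEach on a thread-pool thread and blocks there until the loop finishes, while the loop's iterations are spread across other thread-pool threads.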
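As a rough sketch of what wrapping the loop in a task buys you: the caller gets a Task back immediately and can wait for it later, but the loop itself is not made any faster. This example uses Task.Run instead of the Task.Factory.StartNew from the question, and `BackgroundLoopDemo` and `SumLengthsInBackground` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class BackgroundLoopDemo
{
    // Hypothetical helper: runs the whole parallel loop inside one background task.
    public static Task<int> SumLengthsInBackground(IReadOnlyList<string> items)
    {
        return Task.Run(() =>
        {
            int total = 0;

            // One task hosts the loop; the loop fans its iterations out itself.
            Parallel.ForEach(items, s =>
            {
                Interlocked.Add(ref total, s.Length);
            });

            return total;
        });
    }

    public static void Main()
    {
        Task<int> pending = SumLengthsInBackground(new[] { "a", "bb", "ccc" });

        // The calling thread is free to do other things here.
        Console.WriteLine("loop started in the background");

        // Blocks only at the point where the result is actually needed.
        Console.WriteLine($"total length: {pending.Result}");
    }
}
```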

You may find this ebook useful: http://www.albahari.com/threading/#_Introduction

Upvotes: 2

Zazaeil

Reputation: 4119

Do not mix async code with parallel code. Task is for async operations: querying a DB, reading a file, or awaiting some computationally cheap operation, so that your UI isn't blocked and unresponsive.

Parallel is different. It's designed for 1) multi-core systems and 2) computationally intensive operations. I won't go into detail on how it works; that information can be found in the MS documentation. Long story short, Parallel.For will most probably make its own decisions about what to run, when and how. It may not use your parameters to the full: MaxDegreeOfParallelism, for example, is only an upper bound, so fewer threads may be used. The whole idea is to provide the best possible parallelization and thus complete your operation as fast as possible.

Upvotes: 2
