Reputation: 276
I've been coding in C++ for years, and I've used threads in the past, but I'm just now starting to learn about multithreaded programming and how it actually works.
So far I'm doing okay with understanding the concepts, but one thing has me stumped.
I can't find anything online that explains it well enough for me to understand.
I code in C++, but I'm sure this question can apply to many different programming languages.
Upvotes: 6
Views: 10360
Reputation: 57688
What are parallel for loops, and how do they work?
A parallel for loop is a for loop in which the statements in the loop can be run in parallel: on separate cores, processors, or threads.
Let us take a simple summing example:
unsigned int numbers[] = { 1, 2, 3, 4, 5, 6 };
unsigned int sum = 0;
const unsigned int quantity = sizeof(numbers) / sizeof(numbers[0]);

for (unsigned int i = 0; i < quantity; ++i)
{
    sum = sum + numbers[i];
}
Calculating a sum does not depend on the order in which the numbers are added; it only matters that every number gets added exactly once.
The loop could be split into two loops that are executed by separate threads or processors:
// Even summation loop:
unsigned int even_sum = 0;
for (unsigned int e = 0; e < quantity; e += 2)
{
    even_sum += numbers[e];
}

// Odd summation loop:
unsigned int odd_sum = 0;
for (unsigned int odd = 1; odd < quantity; odd += 2)
{
    odd_sum += numbers[odd];
}

// Combine the partial sums:
sum = even_sum + odd_sum;
The even and odd summing loops are independent of each other. They do not access any of the same memory locations.
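As an illustration (my own sketch, not part of the original answer), the two halves could be run on separate std::thread objects; the variable names mirror the loops above, and each thread writes only its own partial sum, so there is no shared mutable state:

#include <iostream>
#include <thread>

int main()
{
    unsigned int numbers[] = { 1, 2, 3, 4, 5, 6 };
    const unsigned int quantity = sizeof(numbers) / sizeof(numbers[0]);

    unsigned int even_sum = 0;  // written only by even_thread
    unsigned int odd_sum  = 0;  // written only by odd_thread

    // Each thread runs one of the two independent loops.
    std::thread even_thread([&]() {
        for (unsigned int e = 0; e < quantity; e += 2)
            even_sum += numbers[e];
    });
    std::thread odd_thread([&]() {
        for (unsigned int odd = 1; odd < quantity; odd += 2)
            odd_sum += numbers[odd];
    });

    // Wait for both threads, then combine the partial sums.
    even_thread.join();
    odd_thread.join();
    unsigned int sum = even_sum + odd_sum;

    std::cout << "sum = " << sum << '\n';
    return 0;
}

Compile with something like g++ -std=c++11 -pthread; for six numbers the thread overhead far outweighs the benefit, but it shows the mechanics of the split.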
The original summing for loop can therefore be considered a parallel for loop, because its iterations can be run in parallel by separate workers, such as separate CPU cores or threads.
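In practice such a loop is usually not split by hand; a compiler or library construct does the splitting. As one illustrative sketch (my own, not something the original answer showed), OpenMP's parallel for with a reduction expresses the same summing loop:

// Compile with an OpenMP-capable compiler, e.g. g++ -fopenmp.
// The pragma asks the runtime to divide the iterations among threads
// and to combine each thread's private copy of sum at the end.
unsigned int sum = 0;
#pragma omp parallel for reduction(+:sum)
for (unsigned int i = 0; i < quantity; ++i)
{
    sum += numbers[i];
}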
Somebody else can supply a more detailed definition, but this is the general idea.
Edit 1:
Can any for loop be made parallel?
No, not every loop can be made parallel. The iterations of the loop must be independent of each other; that is, one CPU core should be able to run one iteration without any side effects on another CPU core running a different iteration.
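As a counter-example of my own (not from the original answer), a running-total loop has exactly this kind of dependency: each iteration needs the value produced by the previous one, so a simple even/odd split like the one above would give the wrong result.

// Prefix-sum loop: iteration i reads the total produced by iteration i - 1,
// so the iterations are NOT independent and cannot simply be split across cores.
unsigned int running[quantity];  // running[i] holds numbers[0] + ... + numbers[i]
unsigned int total = 0;
for (unsigned int i = 0; i < quantity; ++i)
{
    total = total + numbers[i];  // depends on the previous iteration's total
    running[i] = total;
}

(A prefix sum can still be parallelized, but only with a more elaborate algorithm such as a parallel scan, not by naively splitting the iterations.)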
What are the uses for them?
Performance?
In general, the reason is performance. However, the overhead of setting up the parallel loop must be less than the execution time of the iterations themselves, or nothing is gained. There is also the overhead of waiting for the parallel execution to finish and joining the results together.
Data movement and matrix operations are usually good candidates for parallelism: for example, copying a bitmap or applying a transformation to every pixel of a bitmap. Huge quantities of data need all the help they can get.
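For instance, here is a sketch of my own (assuming a C++17 compiler with parallel algorithm support; with GCC this typically also needs linking against TBB) of applying a per-pixel transformation to a grayscale bitmap with a parallel algorithm:

#include <algorithm>
#include <cstdint>
#include <execution>   // C++17 parallel execution policies
#include <vector>

int main()
{
    // A large grayscale "bitmap": one byte per pixel.
    std::vector<std::uint8_t> bitmap(1920 * 1080, 100);

    // Apply the same transformation to every pixel. The iterations are
    // independent, so the library is free to spread them across cores.
    std::transform(std::execution::par, bitmap.begin(), bitmap.end(),
                   bitmap.begin(),
                   [](std::uint8_t pixel) {
                       // Brighten by 50, clamped to the maximum value 255.
                       return static_cast<std::uint8_t>(pixel < 205 ? pixel + 50 : 255);
                   });
    return 0;
}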
Other functionality?
Yes, there are other possible uses of parallel for loops, such as updating more than one hardware device at the same time. However, the general case is for improving data processing performance.
Upvotes: 9