Reputation: 23
I am fairly new to parallel programming; I am developing a program that deals with numbers in different bases and I want to parallelize the addition method.
Firstly, it computes the linear convolution and does the carrying later, thus every iteration of the for-loop is independent (Example):
(Little Endianness btw) [2,7,1] + [8,7,5] = [10,14,6] => [0,5,7]
The question is, can I, if there are threads available, make the addition process faster by having the iterations done at the same time in different threads, and how?
Upvotes: 0
Views: 840
Reputation: 43495
When you have granular work to do, like adding two integers, and you use the Parallel.For
method, you'll find that the synchronization overhead, as well as the overhead of invoking a non-inlinable lambda for each index, negates any performance gained by the paralellization. In this case it's a good idea to chunkify the workload by operating on ranges of indices, instead of one index at a time. Here is how you can use the Parallel.ForEach
+Partitioner.Create
methods for this purpose:
int[] left = new int[1_000_000];
int[] right = new int[1_000_000];
int[] sum = new int[1_000_000];
ParallelOptions parallelOptions = new()
{
MaxDegreeOfParallelism = Environment.ProcessorCount
};
Partitioner<Tuple<int, int>> partitioner = Partitioner.Create(0, left.Length);
Parallel.ForEach(partitioner, parallelOptions, (Tuple<int, int> range) =>
{
for (int i = range.Item1; i < range.Item2; i++)
{
sum[i] = left[i] + right[i];
}
});
The Partitioner.Create
creates by default about three times as many ranges as the Environment.ProcessorCount
value, so on a quad core machine it will create about 12 ranges in total. This is a good compromise between too many ranges (overhead) and too few ranges (unbalanced workload). If you want you can finetune the number of the partitions, by configuring the third optional rangeSize
argument.
Upvotes: 1
Reputation: 186678
In case of huge arrays (threading has its own overhead), you can try Parallel
, e.g. Parallel.For:
int[] left = ...
int[] right = ...
int[] result = new int[left.Length];
...
Parallel.For(0, left.Length, i => result[i] = left[i] + right[i]);
Let's have a look on the effect:
int N = 100_000_000;
int[] left = new int[N];
int[] right = new int[N];
int[] result = new int[left.Length];
// To prevent garbage collection while testing
GC.Collect(2);
Stopwatch sw = new Stopwatch();
sw.Start();
// Parallel version
//Parallel.For(0, left.Length, i => result[i] = left[i] + right[i]);
// Standard for loop version
for (int i = left.Length - 1; i >= 0; --i)
result[i] = left[i] + right[i];
sw.Stop();
Console.Write(sw.ElapsedMilliseconds);
Outcome (.Net 6 IA-64, Realease build, Core i9, 6 cores)
200 - parallel version
500 - for loop version
Upvotes: 1