Safiron

Reputation: 883

Getting Min, Max, Sum with a single parallel for loop

I am trying to get the minimum, maximum and sum (for the average) from a large array. I would love to substitute my regular for loop with Parallel.For.

UInt16 tempMin = (UInt16)(Math.Pow(2,mfvm.cameras[openCamIndex].bitDepth) - 1);
UInt16 tempMax = 0;
UInt64 tempSum = 0;

for (int i = 0; i < acquisition.frameDataShorts.Length; i++)
{
    if (acquisition.frameDataShorts[i] < tempMin)
        tempMin = acquisition.frameDataShorts[i];

    if (acquisition.frameDataShorts[i] > tempMax)
        tempMax = acquisition.frameDataShorts[i];

    tempSum += acquisition.frameDataShorts[i];
}

I know how to solve this using Tasks by cutting the array myself. However, I would love to learn how to use Parallel.For for this, since as I understand it, it should be able to do this very elegantly.

I found this tutorial from MSDN for calculating the Sum, however I have no idea how to extend it to do all three things (min, max and sum) in a single pass.

Results: OK, I tried the PLINQ solution and I have seen some serious improvements. Three passes (Min, Max, Sum) are about 4x faster than the sequential approach on my i7 (2x4 cores). However, I tried the same code on a Xeon (2x8 cores) and the results are completely different: the parallel version (again three passes) is actually twice as slow as the sequential approach (which itself is about 5x faster than on my i7).

In the end I separated the array myself with Task Factory, and got slightly better results on all computers.
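For reference, the manual partitioning the asker describes could be sketched like this (the method name, the chunking scheme and the tuple layout are my own illustration, not the asker's actual code):

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class ManualPartitionDemo
{
    // One pass per chunk; each task returns its local (min, max, sum),
    // and the partial results are merged on the calling thread.
    public static (ushort Min, ushort Max, ulong Sum) MinMaxSum(ushort[] data, int partitions)
    {
        int chunk = (data.Length + partitions - 1) / partitions;
        var tasks = Enumerable.Range(0, partitions)
            .Select(p => Task.Run(() =>
            {
                int start = p * chunk;
                int end = Math.Min(start + chunk, data.Length);
                ushort min = ushort.MaxValue, max = ushort.MinValue;
                ulong sum = 0;
                for (int i = start; i < end; i++)
                {
                    if (data[i] < min) min = data[i];
                    if (data[i] > max) max = data[i];
                    sum += data[i];
                }
                return (Min: min, Max: max, Sum: sum);
            }))
            .ToArray();

        var results = Task.WhenAll(tasks).Result;
        return (results.Min(r => r.Min), results.Max(r => r.Max),
                results.Aggregate(0UL, (acc, r) => acc + r.Sum));
    }
}
```

Here `partitions` would typically be Environment.ProcessorCount; empty trailing chunks merge harmlessly because their local min/max start at the extremes.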

Upvotes: 2

Views: 2956

Answers (3)

VMAtm

Reputation: 28356

Do not reinvent the wheel: Min, Max, Sum and similar operations are aggregations. Since .NET 3.5 there have been handy LINQ extension methods that already provide the solution:

using System.Linq;

var sequence = Enumerable.Range(0, 10).Select(s => (uint)s).ToList();

Console.WriteLine(sequence.Sum(s => (double)s));
Console.WriteLine(sequence.Max());
Console.WriteLine(sequence.Min());

Though they are declared as extensions for IEnumerable<T>, they have some internal optimizations for IList<T> and array types, so you should measure how your code performs on those types versus plain IEnumerable<T> sequences.

In your case this isn't enough, as you clearly do not want to iterate over the array more than once, so the magic goes here: PLINQ (a.k.a. Parallel LINQ). You need to add only one method call to aggregate your array in parallel:

var sequence = Enumerable.Range(0, 10000000).Select(s => (uint)s).AsParallel();

Console.WriteLine(sequence.Sum(s => (double)s));
Console.WriteLine(sequence.Max());
Console.WriteLine(sequence.Min());

This option adds some overhead for synchronizing the items, but it scales well, providing similar times for both small and large enumerations. From MSDN:

PLINQ is usually the recommended approach whenever you need to apply the parallel aggregation pattern to .NET applications. Its declarative nature makes it less prone to error than other approaches, and its performance on multicore computers is competitive with them.

Implementing parallel aggregation with PLINQ doesn't require adding locks in your code. Instead, all the synchronization occurs internally, within PLINQ.

However, if you still want to investigate the performance of different types of operations, you can use the Parallel.For and Parallel.ForEach method overloads with an aggregation approach, something like this:

double[] sequence = ...
object lockObject = new object();
double sum = 0.0d;
  
Parallel.ForEach(
    // The values to be aggregated 
    sequence,

    // The local initial partial result
    () => 0.0d,

    // The loop body
    (x, loopState, partialResult) =>
    {
        return Normalize(x) + partialResult;
    },

    // The final step of each local context            
    (localPartialSum) =>
    {
        // Enforce serial access to single, shared result
        lock (lockObject)
        {
            sum += localPartialSum;
        }
    }
);
return sum;

If you need additional partition for your data, you can use a Partitioner for the methods:

var rangePartitioner = Partitioner.Create(0, sequence.Length);
        
Parallel.ForEach(
    // The input intervals
    rangePartitioner, 
    // same code here);
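Filled out for the sum, the range-partitioned version might look like this (a sketch following the MSDN pattern; the actual chunk boundaries are chosen by the Partitioner):

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

double[] sequence = Enumerable.Range(0, 1_000_000).Select(i => (double)i).ToArray();
object lockObject = new object();
double sum = 0.0d;

Parallel.ForEach(
    // The input index ranges, e.g. [0, 65536), [65536, 131072), ...
    Partitioner.Create(0, sequence.Length),

    // The local initial partial result
    () => 0.0d,

    // The loop body: a tight sequential loop over one range
    (range, loopState, partialResult) =>
    {
        for (int i = range.Item1; i < range.Item2; i++)
            partialResult += sequence[i];
        return partialResult;
    },

    // Merge each local result into the shared total
    localPartialSum => { lock (lockObject) { sum += localPartialSum; } });

Console.WriteLine(sum); // sum of 0..999999 = 499999500000
```

Because each range is processed by a plain sequential loop, the per-element delegate-call overhead of the element-wise Parallel.ForEach overloads disappears.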

The Aggregate method can also be used with PLINQ, with some merge logic to combine the per-partition results (MSDN illustrates this pattern with a diagram).
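A single-pass min/max/sum with that Aggregate overload might look like this (the (Min, Max, Sum) tuple layout is my own choice for illustration):

```csharp
using System;
using System.Linq;

var data = Enumerable.Range(0, 10_000_000).Select(s => (uint)s).ToArray();

var stats = data.AsParallel().Aggregate(
    // Seed: one local (min, max, sum) accumulator per partition
    () => (Min: uint.MaxValue, Max: uint.MinValue, Sum: 0UL),
    // Fold one element into the partition-local accumulator
    (acc, x) => (Math.Min(acc.Min, x), Math.Max(acc.Max, x), acc.Sum + x),
    // Merge two partition accumulators
    (a, b) => (Math.Min(a.Min, b.Min), Math.Max(a.Max, b.Max), a.Sum + b.Sum),
    // Final projection of the merged accumulator
    acc => acc);

Console.WriteLine($"Min={stats.Min} Max={stats.Max} Sum={stats.Sum}");
```

All the synchronization happens inside PLINQ; the merge lambda is only invoked once per pair of partitions, not per element.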


Upvotes: 1

JSteward

Reputation: 7091

I don't think Parallel.For is a good fit here, but try this out:

public class MyArrayHandler {

    public async Task GetMinMaxSum() {
        var myArray = Enumerable.Range(0, 1000);

        var maxTask = Task.Run(() => myArray.Max());
        var minTask = Task.Run(() => myArray.Min());
        var sumTask = Task.Run(() => myArray.Sum());

        var results = await Task.WhenAll(maxTask,
                                         minTask,
                                         sumTask);
        var max = results[0];
        var min = results[1];
        var sum = results[2];
    }
}

Edit: Just for fun, and due to the comments regarding performance, I took a couple of measurements. I also found this: Fastest way to find sum.

@10,000,000 values

GetMinMaxSum: 218 ms

GetMinMaxSumAsync: 308 ms

public class MinMaxSumTests {

    [Test]
    public async Task GetMinMaxSumAsync() {            
        var myArray = Enumerable.Range(0, 10000000).Select(x => (long)x).ToArray();
        var sw = new Stopwatch();
        sw.Start();

        var maxTask = Task.Run(() => myArray.Max());
        var minTask = Task.Run(() => myArray.Min());
        var sumTask = Task.Run(() => myArray.Sum());

        var results = await Task.WhenAll(maxTask,
                                         minTask,
                                         sumTask);
        var max = results[0];
        var min = results[1];
        var sum = results[2];
        sw.Stop();
        Console.WriteLine(sw.ElapsedMilliseconds);
    }

    [Test]
    public void GetMinMaxSum() {            
        var myArray = Enumerable.Range(0, 10000000).Select(x => (long)x).ToArray();
        var sw = new Stopwatch();
        sw.Start();

        long tempMin = long.MaxValue; // start from the extremes so any element can replace them
        long tempMax = long.MinValue;
        long tempSum = 0;

        for (int i = 0; i < myArray.Length; i++) {
            if (myArray[i] < tempMin)
                tempMin = myArray[i];

            if (myArray[i] > tempMax)
                tempMax = myArray[i];

            tempSum += myArray[i];
        }
        sw.Stop();
        Console.WriteLine(sw.ElapsedMilliseconds);
    }
}

Upvotes: 2

Tomer

Reputation: 1704

I assume that the main issue here is that three different variables have to be carried along on each iteration. You can use a Tuple for this purpose:

var lockObject = new object();
var arr = Enumerable.Range(0, 1000000).ToArray();
long total = 0;
var min = arr[0];
var max = arr[0];

Parallel.For(0, arr.Length,
    () => new Tuple<long, int, int>(0, arr[0], arr[0]),
    (i, loop, temp) => new Tuple<long, int, int>(temp.Item1 + arr[i], Math.Min(temp.Item2, arr[i]),
        Math.Max(temp.Item3, arr[i])),
    x =>
    {
        lock (lockObject)
        {
            total += x.Item1;
            min = Math.Min(min, x.Item2);
            max = Math.Max(max, x.Item3);
        }
    }
);

I must warn you, though, that this implementation runs about 10x slower (on my machine) than the simple for loop approach you demonstrated in your question, so proceed with caution.
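Much of that slowdown is plausibly the heap-allocated Tuple created for every single element. A variant that combines the same merge logic with range partitioning, so each partition runs a plain sequential loop and only allocates a value tuple per partition, is likely to fare much better (a sketch, not measured here):

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

var arr = Enumerable.Range(0, 1_000_000).ToArray();
var lockObject = new object();
long total = 0;
int min = arr[0], max = arr[0];

Parallel.ForEach(
    Partitioner.Create(0, arr.Length),
    // Partition-local state as a value tuple: no per-element allocation
    () => (Sum: 0L, Min: int.MaxValue, Max: int.MinValue),
    (range, state, local) =>
    {
        long sum = local.Sum;
        int lo = local.Min, hi = local.Max;
        for (int i = range.Item1; i < range.Item2; i++)
        {
            int v = arr[i];
            sum += v;
            if (v < lo) lo = v;
            if (v > hi) hi = v;
        }
        return (sum, lo, hi);
    },
    // Same serialized merge into the shared results
    local =>
    {
        lock (lockObject)
        {
            total += local.Sum;
            min = Math.Min(min, local.Min);
            max = Math.Max(max, local.Max);
        }
    });
```

The lock is only taken once per partition, so contention stays negligible regardless of the array size.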

Upvotes: 2
