ebvtrnog
ebvtrnog

Reputation: 4377

C# aggregate in a better time complexity

Let's say we have such an array:

var arr = new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

I want to perform the aggregation:

var sum = arr.Aggregate((a, b) => a + b);

Of course, this is just an example to simplify it. I don't deal with int, but more complex objects that need merging (they are trees). However, This aggregation works really bad, because it iterates from left to right, adding two elements that lie to each other. In case of ints this doesn't make any difference, but in case of complex objects a better solution would be to perform the aggregation in a tree way. What does it mean?

                   55
              36         19
      10             26      19
  3       7      11      15      19
1   2   3   4   5   6   7   8   9   10

I hope this schema makes it clear.

How to achieve this in C#'s LINQ?

Upvotes: 1

Views: 1061

Answers (2)

Martin Liversage
Martin Liversage

Reputation: 106826

You can create a LINQ-like extension method that aggregates a sequence "top-down" in a hierarchical fashion like you describe. However, for efficiency this requires random access to the source sequence so building on top on IEnumerable<T> is not the best choice. But you can use IReadOnlyList<T> as an alternative (which off course requires that your source is stored in an array or list).

static class ReadOnlyListExtensions {

  public static T HierarchicalAggregate<T>(this IReadOnlyList<T> source, Func<T, T, T> func) {
    if (source == null)
      throw new ArgumentNullException("source");
    if (func == null)
      throw new ArgumentNullException("func");
    if (source.Count == 0)
      throw new InvalidOperationException("Sequence contains no elements");
    return Recurse(source, 0, source.Count, func);
  }

  static T Recurse<T>(this IReadOnlyList<T> source, Int32 startIndex, Int32 count, Func<T, T, T> func) {
    if (count == 1)
      return source[startIndex];
    var leftCount = count/2;
    var leftAggregate = Recurse(source, startIndex, leftCount, func);
    var rightCount = count - leftCount;
    var rightAggregate = Recurse(source, startIndex + leftCount, rightCount, func);
    return func(leftAggregate, rightAggregate);
  }

}

Note that the division performed by this algorithm is slightly different compared to your example. At the first level the 10 element sequence is divided into two 5 element sequences that are then each is divided into a 2 and a 3 element sequence etc.:

        55
   15   +   40
 3 + 12    13+27
1+2 3+9   6+7 8+19
     4+5       9+10

Upvotes: 3

Nikolay Kostov
Nikolay Kostov

Reputation: 16973

You can use PLINQ (parallel LINQ).

The Parallel Aggregation pattern uses unshared, local variables that are merged at the end of the computation to give the final result. Using unshared, local variables for partial, locally calculated results is how the steps of a loop can become independent of each other.

If you invoke the AsParallel extension method, you're instructing the compiler to bind to PLINQ instead of to LINQ. The program will use the parallel versions of all further query operations within the expression.

In your case the code would be:

var sum = arr.AsParallel().Aggregate((a, b) => a + b);

Here you can find more information: https://msdn.microsoft.com/en-us/library/ff963547.aspx

enter image description here

Upvotes: 5

Related Questions