Reputation: 131
I had a look at the .NET Framework Sourcecode and stumbled on the implementation of LINQ-Sum
int Sum(this IEnumerable<int> source)
I saw that it was implemented with a foreach-loop and wondered why the guys at MS don't use a normal for-loop due to performance reasons (later I learnt that there is no longer a performance difference between a for-loop and a foreach-loop - but I didn't know that just until now).
So I copied the MS-implementation in my own project and wrote a little benchmark:
var range = Enumerable.Range(1, 1000);
Stopwatch sw = new Stopwatch();
//Do sth unimportant for warming up
sw.Start();
for(int i = 0; i <= 10000; i++)
{
long z = i + 3;
}
sw.Stop();
//Implementation 1
sw.Reset();
sw.Start();
for (int i = 0; i <= 1000000; i++)
{
long i1 = range.Sum1();
}
sw.Stop();
Console.WriteLine("Sum1: " + sw.ElapsedTicks.ToString());
//Implementation 2
sw.Reset();
sw.Start();
for (int i = 0; i <= 1000000; i++)
{
long i2 = range.Sum2();
}
sw.Stop();
Console.WriteLine("Sum2: " + sw.ElapsedTicks.ToString());
And here are the two implementations of Sum (Note: both are identical, I first wanted to check if the measuring is working correctly):
public static class LinqExtension
{
public static int Sum1(this IEnumerable<int> source)
{
int sum = 0;
checked
{
foreach (int v in source) sum += v;
}
return sum;
}
public static int Sum2(this IEnumerable<int> source)
{
int sum = 0;
checked
{
foreach (int v in source) sum += v;
}
return sum;
}
}
Surprisingly I got two different results : Sum1 = 16043441 vs. Sum2 = 17480907
So I extended the benchmark a little bit and called Sum1 and Sum2 not just once, but multiple times in the following order:
Hence Sum1 is always nearly 10% faster than Sum2. When I call Sum2 first, the result are contrary.
What causes these performance differences? Why is the first called method faster than the second one? Is my benchmark invalid?
I'm using Visual Studio 2015 CTP4 and .NET Framework 4.5.3
EDIT:
Results in milliseconds instead of ticks
Thanks to the comments, I fixed some mistakes and now the code looks like that:
sw.Start();
for (int i = 0; i <= 1000000; i++)
{
i1 = range.Sum1();
}
sw.Stop();
Console.WriteLine("Sum1: " + sw.ElapsedMilliseconds.ToString() + "\n" + i1.ToString());
Now the results are totally different:
But there is still a difference, yet now the other way around.
Another update:
When I use
int[] range = new int[1000];
for (int m = 0; m < range.Length; m++)
range[m] = m+1;
instead of
var range = Enumerable.Range(1, 1000);
both methods are equally fast.
Update: Tested it with Mono(SharpDevelop) and VS2013 and I got perfectly consistent results. So I think using VS2015 wasn't a great idea, since it's still a beta. Therefore the significance of the results is pretty low.
Another Update:
stakx commented:
Try calling each of your
Sum1
andSum2
methods at least once before you start measuring time, in order to make sure that the methods' code has been generated by the JIT. Otherwise you might be including the time required for JIT code generation in your benchmarking
So I called Sum1
and Sum2
one time before the measurings and suprisingly this solves the problem. But I don't understand why. I understand that generating the code by the JIT costs some time, but only the first time. In my test I have 20 for-loops, each of them calling Sum1 respectively Sum2 1.000.000 times. I do a measuring for every loop, and get constantly different values for Sum1 and Sum2.
It would make sense, if the very first loop is slower, but that's not the case.
I've used ngen.exe to generate a native image and got the following results:
So there is still this difference.
Very important: It is NOT always the first method which is faster! Sometimes it's the first called method, sometimes the second one. But once the assembly was built the results are reproducible. It's pretty confusing for me and I can't see any pattern, when this happens.
Enigmativity:
Did you ever try swapping the order in which you called the methods? Calling
Sum2
first?
Yeah, but then the result are just inverse. If Sum1
was the "fast method", after swapping, Sum2
is the fast one and Sum1
is the slow one.
Upvotes: 1
Views: 290
Reputation: 18665
I modified your test code a little bit and I found that there are two factors that have an impact on the performance: whether you run over the same or two different collections and the type of the enumeration (I don't know why yet).
Enumerating a List<int>
seems to be the slowest case. An array int[]
is the fastest one and there is actually no difference between the two when you use two different ranges (there is however always a difference when lists are used):
static void Main(string[] args)
{
// Try with .ToList() and .ToArray()
var range1 = Enumerable.Range(1, 1000);
var range2 = Enumerable.Range(1, 1000);
int numberOfSums = 100000;
int numberOfTests = 3;
for (int i = 0; i < numberOfTests; i++)
{
SumBenchmark(range1, LinqExtension.Sum1, numberOfSums, "Sum1");
}
for (int i = 0; i < numberOfTests; i++)
{
// Also try with range1
SumBenchmark(range2, LinqExtension.Sum2, numberOfSums, "Sum2");
}
Console.ReadKey();
}
static void SumBenchmark(IEnumerable<int> numbers, Func<IEnumerable<int>, int> sum, int numberOfSums, string name)
{
Stopwatch sw = new Stopwatch();
sw.Start();
for (int i = 0; i < numberOfSums; i++)
{
long result = sum(numbers);
}
sw.Stop();
Console.WriteLine("{2}: {0} ticks in {1} ms ", sw.ElapsedTicks.ToString(), sw.ElapsedMilliseconds.ToString(), name);
}
For me on the cotrary the second call is always faster.
EDIT: there is a huge performace hit if you disable the Prefer 32-bit
option in the build settings and compile it as 64-bit
- then the original Sum
runs much faster:
...however it runs at the same speed without .ToList()
and .ToArray()
EDIT-2: here's one more result where the Sum2
uses an int[]
not an IEnumerable
:
Sum1: in 878 ms
Sum1: in 863 ms
Sum1: in 875 ms
Sum2: in 122 ms
Sum2: in 122 ms
Sum2: in 121 ms
Linq: in 830 ms
Linq: in 825 ms
Linq: in 836 ms
The generated IL
is also different:
for
public static int Sum2(this int[] source)
it's
.method public hidebysig static int32 Sum2(int32[] source) cil managed
{
.custom instance void [mscorlib]System.Runtime.CompilerServices.ExtensionAttribute::.ctor() = ( 01 00 00 00 )
// Code size 28 (0x1c)
.maxstack 2
.locals init ([0] int32 sum,
[1] int32 v,
[2] int32[] CS$6$0000,
[3] int32 CS$7$0001)
IL_0000: ldc.i4.0
IL_0001: stloc.0
IL_0002: ldarg.0
IL_0003: stloc.2
IL_0004: ldc.i4.0
IL_0005: stloc.3
IL_0006: br.s IL_0014
IL_0008: ldloc.2
IL_0009: ldloc.3
IL_000a: ldelem.i4
IL_000b: stloc.1
IL_000c: ldloc.0
IL_000d: ldloc.1
IL_000e: add.ovf
IL_000f: stloc.0
IL_0010: ldloc.3
IL_0011: ldc.i4.1
IL_0012: add
IL_0013: stloc.3
IL_0014: ldloc.3
IL_0015: ldloc.2
IL_0016: ldlen
IL_0017: conv.i4
IL_0018: blt.s IL_0008
IL_001a: ldloc.0
IL_001b: ret
} // end of method LinqExtension::Sum2
and for
public static int Sum1(this IEnumerable<int> source)
it's
.method public hidebysig static int32 Sum1(class [mscorlib]System.Collections.Generic.IEnumerable`1<int32> source) cil managed
{
.custom instance void [mscorlib]System.Runtime.CompilerServices.ExtensionAttribute::.ctor() = ( 01 00 00 00 )
// Code size 44 (0x2c)
.maxstack 2
.locals init ([0] int32 sum,
[1] int32 v,
[2] class [mscorlib]System.Collections.Generic.IEnumerator`1<int32> CS$5$0000)
IL_0000: ldc.i4.0
IL_0001: stloc.0
IL_0002: ldarg.0
IL_0003: callvirt instance class [mscorlib]System.Collections.Generic.IEnumerator`1<!0> class [mscorlib]System.Collections.Generic.IEnumerable`1<int32>::GetEnumerator()
IL_0008: stloc.2
.try
{
IL_0009: br.s IL_0016
IL_000b: ldloc.2
IL_000c: callvirt instance !0 class [mscorlib]System.Collections.Generic.IEnumerator`1<int32>::get_Current()
IL_0011: stloc.1
IL_0012: ldloc.0
IL_0013: ldloc.1
IL_0014: add.ovf
IL_0015: stloc.0
IL_0016: ldloc.2
IL_0017: callvirt instance bool [mscorlib]System.Collections.IEnumerator::MoveNext()
IL_001c: brtrue.s IL_000b
IL_001e: leave.s IL_002a
} // end .try
finally
{
IL_0020: ldloc.2
IL_0021: brfalse.s IL_0029
IL_0023: ldloc.2
IL_0024: callvirt instance void [mscorlib]System.IDisposable::Dispose()
IL_0029: endfinally
} // end handler
IL_002a: ldloc.0
IL_002b: ret
} // end of method LinqExtension::Sum1
after all iterating an IEnumerable
argument is not too fast... and the optimization takes place only if the collection of the foreach
loop is not an IEnumerable
. DEMO
Upvotes: 1