Reputation: 1455
We are seeing a pretty interesting effect in a small test app that we cannot explain.
We have this code block:
while (true)
{
    for (int i = 0; i < 1920; i++)
    {
        for (int j = 0; j < 1080; j++)
        {
            //l += rand.Next(j);
            l += matcher.Next(j);
        }
    }
    k++;
    Console.WriteLine("{0}: Iteration {1}, l: {2}", DateTime.Now.ToString("hh:mm:ss"), k, l);
    l = 0;
    handle.WaitOne(100);
}
The Matcher class does only one thing in its Next(j) call:
it returns the result of calling Next(j) on its internally created
Random object (so we are just adding one simple function call).
Here's the Matcher class definition:
class Matcher
{
    private Random rand = new Random();
    internal int Next(int j)
    {
        return rand.Next(j);
    }
}
When we execute this code, a single iteration takes around 6 seconds on an Intel Core 2 Quad.
However, if we comment out the line l += matcher.Next(j);
and uncomment the line l += rand.Next(j);
a single iteration takes around one second.
Does anyone have any ideas why that happens?
Upvotes: 0
Views: 257
Reputation: 22260
The code looks OK. Be sure to run this in release mode, as that can make a big difference. Also, I've even seen the platform target setting have a major effect on certain low-level operations. So try both x86 and x64.
I've got it set up to run in release mode for "Any CPU" with "Optimize Code" on, and it's running in 39 ms.
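To compare the two variants under the same conditions, a small timing harness along these lines may help. It is only a sketch: TimingSketch and TimeOnePass are made-up names, the Matcher class is copied from the question, and the numbers will differ by machine, build configuration, and whether a debugger or profiler is attached.

```csharp
using System;
using System.Diagnostics;

// Same wrapper as in the question.
class Matcher
{
    private Random rand = new Random();
    internal int Next(int j)
    {
        return rand.Next(j);
    }
}

// Hypothetical harness, not part of the original program.
static class TimingSketch
{
    // Runs one 1920 x 1080 pass and returns the elapsed milliseconds.
    // The sum is handed back so the loop cannot be discarded as dead code.
    // Both variants pay for the same loop-invariant branch.
    internal static double TimeOnePass(bool useWrapper, out long sum)
    {
        Random rand = new Random();
        Matcher matcher = new Matcher();
        long l = 0;
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < 1920; i++)
        {
            for (int j = 0; j < 1080; j++)
            {
                if (useWrapper)
                    l += matcher.Next(j);
                else
                    l += rand.Next(j);
            }
        }
        watch.Stop();
        sum = l;
        return watch.Elapsed.TotalMilliseconds;
    }

    static void Main()
    {
        long sum;
        // Warm-up passes so JIT compilation is not part of the measurement.
        TimeOnePass(false, out sum);
        TimeOnePass(true, out sum);

        // An attached debugger or call-level tracing can dominate the timings.
        Console.WriteLine("Debugger attached: {0}", Debugger.IsAttached);
        Console.WriteLine("Direct:  {0:F1} ms", TimeOnePass(false, out sum));
        Console.WriteLine("Wrapped: {0:F1} ms", TimeOnePass(true, out sum));
    }
}
```

Running it once with Ctrl+F5 (no debugger) in a release build and once under the debugger should show how much of the difference comes from the environment rather than from the extra call.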
Upvotes: 0
Reputation: 1455
Community, thank you very much for all your help. The problem was that I was running the app from VS2010 with IntelliTrace enabled in "Events & Calls" mode rather than "Events Only". That was the simple explanation.
Upvotes: 0
Reputation: 26446
Of course the extra call takes some time (the short answer :P). Why it takes 6 times as long is hard to say. When looking at performance, you'd be better off looking at the generated IL (use ILSpy, for instance), as the compiler may be able to optimize in one situation but not in the other.
In the end, one line of C# may compile to different IL instructions, or even be translated to different native code (so it may run faster on your machine but slower on another). In some cases the garbage collector may even be the cause of the difference.
Upvotes: 1
Reputation: 727047
It's not just a simple function call; there is also a dereference of the internal rand object taking place in there. The JIT compiler can optimize your loop a lot when you call Next on the Random directly, but not when the call is inside another class.
The function is called more than 2,000,000 times (1920 × 1080 = 2,073,600), and each call takes about three microseconds (6 s / 2,073,600 ≈ 2.9 µs). It's not the fastest function in the world, but it's not overly slow either: it's the multiplication effect that kills the performance.
If matcher does not take any locks, or if it's OK to create multiple copies of it, you can speed it up by parallelizing your algorithm, say, eight-way for your Core 2 Quad. The code should finish in under a second.
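A minimal sketch of that idea, assuming it is acceptable to create one matcher per worker thread: it uses the Parallel.For overload with thread-local state from .NET 4. SeededMatcher, Worker, and ParallelSketch are hypothetical names. The seed parameter is an addition of mine, made because Random is not thread-safe and several Random objects created at the same moment may otherwise produce the same sequence.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical variant of Matcher that accepts a seed (not in the question).
class SeededMatcher
{
    private readonly Random rand;
    internal SeededMatcher(int seed)
    {
        rand = new Random(seed);
    }
    internal int Next(int j)
    {
        return rand.Next(j);
    }
}

static class ParallelSketch
{
    private static int seedCounter = Environment.TickCount;

    // Per-thread state: a private matcher and a private partial sum.
    private sealed class Worker
    {
        internal readonly SeededMatcher Matcher =
            new SeededMatcher(Interlocked.Increment(ref seedCounter));
        internal long Sum;
    }

    // Splits the outer loop across threads and combines the partial sums.
    internal static long SumOnePass()
    {
        long total = 0;
        Parallel.For(0, 1920,
            () => new Worker(),                  // runs once per worker thread
            (i, loopState, worker) =>
            {
                for (int j = 0; j < 1080; j++)
                {
                    worker.Sum += worker.Matcher.Next(j);
                }
                return worker;
            },
            worker => Interlocked.Add(ref total, worker.Sum));
        return total;
    }

    static void Main()
    {
        Console.WriteLine("l: {0}", SumOnePass());
    }
}
```

Because each worker draws from its own Random, the resulting sum will not match a single-threaded run number for number; that is fine here since the values are random anyway, but it would matter if reproducibility were required.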
Upvotes: 3