Reputation: 2640
I tried to improve the performance of a piece of code in my project by generating IL specifically for that task.
The task is currently done by running a for-loop over the elements of an array and calling various methods through an interface. I wanted to replace it with IL that performs the same work without any virtual/interface calls, by directly performing the needed operations.
For some reason, the run-time performance of this DynamicMethod is much worse than that of the original code, which makes interface calls per element. The only reason I can see is that my DynamicMethod is quite large (a few instructions per element of the array).
I thought only the first call might be slow because of JIT compilation, but that is not the case: all calls are slower. Has anybody encountered something like this?
Edit:
People here asked for code. The original code is quite large, but here is a scaled-down version (it is automatic differentiation code that computes a function's gradient using reverse-mode AD). All elements in my array inherit from the following class:
abstract class Element
{
    public double Value;
    public double Adjoint;
    public abstract void Accept(IVisitor visitor);
}
I have classes that derive from Element. For simplicity, I will define only the following two:
class Sum : Element
{
    public int IndexOfLeft; // the index in the array of the first operand
    public int IndexOfRight; // the index in the array of the second operand
    public override void Accept(IVisitor visitor) { visitor.Visit(this); }
}
class Product : Element
{
    public int IndexOfLeft; // the index in the array of the first operand
    public int IndexOfRight; // the index in the array of the second operand
    public override void Accept(IVisitor visitor) { visitor.Visit(this); }
}
Here is the implementation of the visitor:
class Visitor : IVisitor
{
    private Element[] array;
    public Visitor(Element[] array) { this.array = array; }
    public void Visit(Product product)
    {
        var left = array[product.IndexOfLeft].Value;
        var right = array[product.IndexOfRight].Value;
        // here we update product.Value and product.Adjoint according to some mathematical formulas involving left & right
    }
    public void Visit(Sum sum)
    {
        var left = array[sum.IndexOfLeft].Value;
        var right = array[sum.IndexOfRight].Value;
        // here we update sum.Value and sum.Adjoint according to some mathematical formulas involving left & right
    }
}
My original code looks like this:
void Compute(Element[] array)
{
    var visitor = new Visitor(array);
    for (int i = 0; i < array.Length; ++i)
        array[i].Accept(visitor);
}
My new code attempts to do something like this:
void GenerateIL(Element[] array, ILGenerator ilGenerator)
{
    for (int i = 0; i < array.Length; ++i)
    {
        // for each element we emit calls that push "array[i]" and "array"
        // to the stack, treating "i" as constant,
        // and emit a call to a method similar to Visit in the above visitor that
        // performs a computation similar to Visitor.Visit.
    }
}
Then I call the generated code, and it executes more slowly than the double dispatch I get from the visitor pattern when calling Compute(array).
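For readers who want something concrete to experiment with, here is a minimal sketch of the kind of unrolled DynamicMethod described above. It is not the original code: the classes are simplified stand-ins (no visitor, no Adjoint), `Constant`, `UnrolledSketch`, `Build` and `EmitLoadElement` are hypothetical names, and only the forward value of a Sum is emitted inline rather than through a helper call.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// Simplified stand-ins for the classes in the question (assumption: no visitor, no Adjoint).
public abstract class Element
{
    public double Value;
}

// Hypothetical leaf node that only holds an input value.
public sealed class Constant : Element { }

public sealed class Sum : Element
{
    public int IndexOfLeft;
    public int IndexOfRight;
}

public static class UnrolledSketch
{
    // Emits one straight-line method: for every Sum at index i,
    // array[i].Value = array[left].Value + array[right].Value.
    public static Action<Element[]> Build(Element[] array)
    {
        FieldInfo valueField = typeof(Element).GetField("Value");
        var method = new DynamicMethod(
            "ComputeUnrolled", typeof(void), new[] { typeof(Element[]) });
        ILGenerator il = method.GetILGenerator();

        for (int i = 0; i < array.Length; ++i)
        {
            var sum = array[i] as Sum;
            if (sum == null) continue; // leaves need no generated code

            EmitLoadElement(il, i);                 // target object for stfld
            EmitLoadElement(il, sum.IndexOfLeft);
            il.Emit(OpCodes.Ldfld, valueField);     // left operand value
            EmitLoadElement(il, sum.IndexOfRight);
            il.Emit(OpCodes.Ldfld, valueField);     // right operand value
            il.Emit(OpCodes.Add);
            il.Emit(OpCodes.Stfld, valueField);     // store into array[i].Value
        }

        il.Emit(OpCodes.Ret);
        return (Action<Element[]>)method.CreateDelegate(typeof(Action<Element[]>));
    }

    // Pushes array[index] onto the evaluation stack, with index as a constant.
    private static void EmitLoadElement(ILGenerator il, int index)
    {
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ldc_I4, index);
        il.Emit(OpCodes.Ldelem_Ref);
    }
}
```

The returned delegate is invoked as `compute(array)`. Note that in this sketch every Sum contributes about a dozen instructions, so the method body grows linearly with the array, which is the size concern raised in the question.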
Upvotes: 1
Views: 1630
Reputation: 19143
Have you tried to fool the JIT into using faster memory by enclosing the loop in a try-catch block? This also lets you drop the loop's exit condition, which saves a bit of IL.
try
{
    var visitor = new Visitor(array);
    // no exit condition: the loop ends when array[i] throws
    for (int i = 0; ; ++i)
        array[i].Accept(visitor);
}
catch (IndexOutOfRangeException)
{ }
It looks awful but it takes advantage of a JIT memory allocation quirk that may help fix your IL performance issue.
See Optimisation of for loop for more info on this.
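Whether this trick helps is easy to measure. The following is a rough timing sketch, not a rigorous benchmark: `TimingSketch` and `AverageMilliseconds` are hypothetical names, and the single warm-up call is only an assumption about how to keep JIT compilation out of the measurement.

```csharp
using System;
using System.Diagnostics;

public static class TimingSketch
{
    // Calls `action` once as a warm-up (so JIT compilation is not timed),
    // then returns the average wall-clock milliseconds per call.
    public static double AverageMilliseconds(Action action, int iterations)
    {
        action(); // warm-up call, excluded from the measurement

        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; ++i)
            action();
        stopwatch.Stop();

        return stopwatch.Elapsed.TotalMilliseconds / iterations;
    }
}
```

It could be used as `TimingSketch.AverageMilliseconds(() => Compute(array), 1000)` for each variant, ideally in a Release build without a debugger attached.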
Upvotes: 1
Reputation: 11348
I'm curious why the title says dynamic method when you are generating the IL. Do you mean dynamic IL generation followed by static execution, or are you also generating IL that uses the IL equivalent of the C# dynamic keyword?
Dynamic (runtime) IL
I'll assume the code is jitted only once, and that you have checked this.
The use of an array rather than generics in the sample adds to the mystery. The issue is with the generated IL, not with the code that generates it. But if the generated IL passes array values around as object, it will box and unbox them: expensive copies from stack to heap and back.
Does your IL generate code that uses the box and unbox IL operations? I would start right there.
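As a reminder of what to look for, here is a small sketch of where boxing does and does not occur in C#; `BoxingSketch` and its methods are hypothetical names used only for illustration. Note that an array of class instances such as Element[] holds references, so reading a double field from one of its elements does not box by itself.

```csharp
using System;

public static class BoxingSketch
{
    // Each conversion of a double to object allocates a new heap object.
    public static bool BoxesAreDistinct()
    {
        double value = 1.5;
        object first = value;         // compiles to a `box` instruction
        object second = value;        // a second, separate heap allocation
        double copy = (double)first;  // compiles to `unbox.any`
        return !ReferenceEquals(first, second) && copy == value;
    }

    // Reading from a typed double[] involves no boxing at all.
    public static double SumWithoutBoxing(double[] values)
    {
        double total = 0.0;
        foreach (double v in values)
            total += v;
        return total;
    }
}
```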
Collection initialization is the next place to look.
Just some other quick thoughts: generating large sections of code in one method in the hope of saving method-call overhead can have a negative impact on JIT times, because the compiler must process the whole method at once. Small methods are compiled only when needed. But you said it wasn't a JIT issue.
Do these large methods perform large stack operations?
Are there any large value objects inside methods that get called often, e.g. structs larger than 64 bytes? They are allocated and destroyed on the stack on each call.
What does the RedGate performance profiler tell you? http://www.red-gate.com/products/dotnet-development/ants-performance-profiler/?utm_source=google&utm_medium=cpc&utm_content=unmet_need&utm_campaign=antsperformanceprofiler&gclid=CIXamdiA6bICFQYcpQodDA0Akw
BTW, I'm a novice at IL, just throwing a few ideas out there.
Good luck.
Upvotes: 1
Reputation:
If you are really interested in super-optimising your code, you need to learn IL!
Take a look at the IL opcodes at the following link:
http://msdn.microsoft.com/en-us/library/system.reflection.emit.opcodes(v=vs.95).aspx
Also use ILDasm to look at the code produced from your methods.
Though I suspect you won't be able to optimize the IL very much, and would be far better off writing it in C++ and calling out to unmanaged code.
Just a thought for you.
Good luck Matthew
Upvotes: 1
Reputation: 1434
If I have understood correctly, you are trying to remove the overhead of calling a virtual method by emitting the code itself and calling it directly: instead of calling thousands of virtual functions, you want to call one.
But you want different objects to have the same interface, and you can achieve that only through virtual calls: implementing an interface, using delegates, or emitting code. Yes, even if you emit code you need some kind of interface to call that method, whether by invoking the delegate or by casting it to one of the predefined Func/Action delegates.
If you want an efficient way to emit code, I suggest using "LambdaExpression.CompileToMethod". That method takes a MethodBuilder, and I assume you already have one. You can find plenty of examples on the internet. But this will still result in a virtual call too.
As a result, if you want many objects to share the same interface, you cannot have non-virtual calls unless you sort your objects into different bins according to their types, which goes against polymorphism.
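To illustrate the expression-tree route, here is a minimal sketch. It swaps in `Expression.Lambda<T>(...).Compile()` rather than `CompileToMethod`, because that needs no MethodBuilder; `Node`, `ExpressionSketch`, `BuildSum` and `ValueAt` are hypothetical names, and only a single sum is built.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical simplified element type with a single field.
public class Node
{
    public double Value;
}

public static class ExpressionSketch
{
    // Builds the expression nodes[index].Value.
    private static MemberExpression ValueAt(ParameterExpression nodes, int index)
    {
        return Expression.Field(
            Expression.ArrayIndex(nodes, Expression.Constant(index)), "Value");
    }

    // Compiles: nodes[target].Value = nodes[left].Value + nodes[right].Value
    public static Action<Node[]> BuildSum(int target, int left, int right)
    {
        ParameterExpression nodes = Expression.Parameter(typeof(Node[]), "nodes");
        Expression assign = Expression.Assign(
            ValueAt(nodes, target),
            Expression.Add(ValueAt(nodes, left), ValueAt(nodes, right)));
        return Expression.Lambda<Action<Node[]>>(assign, nodes).Compile();
    }
}
```

Calling `ExpressionSketch.BuildSum(2, 0, 1)(nodes)` writes the sum of the first two values into the third node. As the answer points out, the result is still reached through a delegate invocation, so one indirect call remains.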
Upvotes: 1