Reputation: 7705
I'm working on an engine where we copy around lots and lots of properties dynamically at runtime. Depending on the situation, we may or may not modify the property value along the way. It was originally written with reflection, but due to performance issues, we recently rewrote it in Reflection.Emit. The rewrite is complete and performance is obviously a lot better, but now the code is being benchmarked against hand-written C#. Obviously, to be a fair fight, the hand-written C# for the benchmarks has "similar functionality" (you'll see what I mean in a sec) as the IL.
Some of the IL engine has been signed off on, as it has passed with flying colors and is pretty much 1:1 with the hand-written C#. This tells me:
there is no overhead in calling the dynamic method
our general concept and implementation is correct
benchmarking is correct
IL and hand-written C# are being tested in exactly the same way, so no funny JIT business is going on (I don't think)
We went in expecting the IL to be slightly slower than the hand-written code, but that has not been the case so far. It's maybe a few ms slower in long rounds, but you can take shortcuts in IL, so that helps make up the difference.
In one particular case, it's substantially slower: 2x slower.
In C#, you'd have:
class Source
{
    public string S1 { get; set; }
    public int I1 { get; set; }
    public int I2 { get; set; }
    public double D1 { get; set; }
    public double D2 { get; set; }
    public double D3 { get; set; }
}
class Dest
{
    public string S1 { get; set; }
    public int I1 { get; set; }
    public string I2 { get; set; }
    public double D1 { get; set; }
    public int D2 { get; set; }
    public string D3 { get; set; }
}
static Dest Test(Source s)
{
    Dest d = new Dest();
    object o = s.D3;
    if (o != null)
        d.D3 = o.ToString();
    return d;
}
This is what I meant by similar functionality. To be generic, when we copy a property to a string, we first box it and then call Object.ToString(). Natively, value types call ToString differently, hence the code above, to be apples to apples.
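To make the "boxed versus native" distinction concrete, here is a minimal sketch of the two call shapes. The class and method names (`BoxedToStringDemo`, `Direct`, `ViaObject`) are made up for illustration and are not part of the engine described above.

```csharp
using System;

static class BoxedToStringDemo
{
    // Direct call: the compiler binds straight to Double.ToString(), no boxing involved.
    public static string Direct(double value)
    {
        return value.ToString();
    }

    // Boxed call: mirrors the generic path, a box followed by a virtual Object.ToString() call.
    public static string ViaObject(double value)
    {
        object boxed = value;      // box [mscorlib]System.Double
        return boxed.ToString();   // callvirt Object::ToString()
    }

    static void Main()
    {
        double d = 1.5;
        // Both paths end up in Double.ToString(), so the resulting text is identical.
        Console.WriteLine(Direct(d) == ViaObject(d)); // True
    }
}
```

The two methods return the same string; they differ only in the IL used to get there, which is why the hand-written benchmark boxes explicitly.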
If I comment out the D3 copy/ToString and uncomment the other 5 properties, I'm back to 1:1 with the C#.
You'll notice that I2 is int -> string, but for some reason, that one doesn't have the same problem as the double -> string. I get that double ToString() is more expensive in general, but that expense should show up in the C# code too, and it doesn't.
The code I emit for the D3 copy is the same code I emit for the I2 copy, so why the huge overhead on the D3 copy?
EDIT:
The compiler emits:
IL_0000: newobj instance void ConsoleApplication3.Dest::.ctor()
IL_0005: ldarg.0
IL_0006: callvirt instance float64 ConsoleApplication3.Source::get_D3()
IL_000b: box [mscorlib]System.Double
IL_0010: stloc.0
IL_0011: dup
IL_0012: ldloc.0
IL_0013: brtrue.s IL_0018
IL_0015: ldnull
IL_0016: br.s IL_001e
IL_0018: ldloc.0
IL_0019: callvirt instance string [mscorlib]System.Object::ToString()
IL_001e: callvirt instance void ConsoleApplication3.Dest::set_D3(string)
IL_0023: ret
This particular section of my code does not emit the new for the Dest object; that's done elsewhere. The dup duplicates the Dest object reference, as seen in the C# above.
LocalBuilder localBuilderObject = generator.DeclareLocal(_typeOfObject);
Label labelNull = generator.DefineLabel();
Label labelNotNull = generator.DefineLabel();
// Read the source property and box it into the object local.
generator.Emit(OpCodes.Ldarg_0);
generator.Emit(OpCodes.Callvirt, miGetter);
generator.Emit(OpCodes.Box, typeSource);
generator.Emit(OpCodes.Stloc_S, localBuilderObject);
// Duplicate the Dest reference that is already on the stack.
generator.Emit(OpCodes.Dup);
generator.Emit(OpCodes.Ldloc_S, localBuilderObject);
generator.Emit(OpCodes.Brtrue, labelNotNull);
// Null case: pass null to the setter.
generator.Emit(OpCodes.Ldnull);
generator.Emit(OpCodes.Br, labelNull);
// Non-null case: pass the result of Object.ToString() to the setter.
generator.MarkLabel(labelNotNull);
generator.Emit(OpCodes.Ldloc_S, localBuilderObject);
generator.Emit(OpCodes.Callvirt, _miToString);
generator.MarkLabel(labelNull);
generator.Emit(OpCodes.Callvirt, miSetter);
As I mentioned, I box the type so I can call Object::ToString() generically without worrying about value types. Ref types go through this path as well. The C# code is made to behave like this and still takes half the time?
I've been messing with this issue all weekend. Further testing shows other value types (int, long, etc.) are 1:1. For some reason the double is causing a problem.
Upvotes: 2
Views: 862
Reputation: 772
Jump over if null (brfalse) instead of the double jump. Your benchmark may be misleading for 3 reasons, depending on how you call your generated code (not posted here):
Upvotes: 1
Reputation: 10287
As you can see in the C# compiled code, fast local-access instructions are used:
IL_000b: box [mscorlib]System.Double
IL_0010: stloc.0
IL_0011: dup
IL_0012: ldloc.0
...
IL_0018: ldloc.0
Instead, in your IL generated code, you use stloc.s and ldloc.s, which also take an operand for the local index.
Also make sure that you cache the generated method per Type it's being generated for (you probably are, if the C# runs only twice as fast).
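Both suggestions can be sketched together as follows. This is only an illustration under stated assumptions: `CopierCache` and `GetD3Copier` are hypothetical names, `Source` and `Dest` are trimmed to the single D3 property, and the generated method takes the destination as a second argument rather than finding it on the stack as the question's engine does. It uses the operand-free `stloc.0`/`ldloc.0` forms and a single `brfalse` that skips the copy when the boxed value is null. The short forms make the IL smaller; whether that alone closes the 2x gap is not something this sketch demonstrates.

```csharp
using System;
using System.Collections.Concurrent;
using System.Reflection;
using System.Reflection.Emit;

public class Source { public double D3 { get; set; } }
public class Dest { public string D3 { get; set; } }

public static class CopierCache
{
    // One compiled delegate per (source type, destination type) pair.
    private static readonly ConcurrentDictionary<Tuple<Type, Type>, Delegate> Cache =
        new ConcurrentDictionary<Tuple<Type, Type>, Delegate>();

    public static Action<Source, Dest> GetD3Copier()
    {
        Tuple<Type, Type> key = Tuple.Create(typeof(Source), typeof(Dest));
        return (Action<Source, Dest>)Cache.GetOrAdd(key, _ => Build());
    }

    private static Delegate Build()
    {
        MethodInfo getter = typeof(Source).GetProperty("D3").GetGetMethod();
        MethodInfo setter = typeof(Dest).GetProperty("D3").GetSetMethod();
        MethodInfo toString = typeof(object).GetMethod("ToString", Type.EmptyTypes);

        DynamicMethod method = new DynamicMethod("CopyD3", typeof(void),
            new[] { typeof(Source), typeof(Dest) }, typeof(CopierCache).Module);
        ILGenerator il = method.GetILGenerator();
        il.DeclareLocal(typeof(object));          // local 0 holds the boxed value
        Label done = il.DefineLabel();

        il.Emit(OpCodes.Ldarg_0);                 // source
        il.Emit(OpCodes.Callvirt, getter);        // source.D3
        il.Emit(OpCodes.Box, typeof(double));
        il.Emit(OpCodes.Stloc_0);                 // operand-free store to local 0
        il.Emit(OpCodes.Ldloc_0);
        il.Emit(OpCodes.Brfalse_S, done);         // single jump: skip the copy if null
        il.Emit(OpCodes.Ldarg_1);                 // dest
        il.Emit(OpCodes.Ldloc_0);
        il.Emit(OpCodes.Callvirt, toString);      // Object.ToString()
        il.Emit(OpCodes.Callvirt, setter);        // dest.D3 = ...
        il.MarkLabel(done);
        il.Emit(OpCodes.Ret);

        return method.CreateDelegate(typeof(Action<Source, Dest>));
    }
}

public static class Program
{
    public static void Main()
    {
        Action<Source, Dest> copy = CopierCache.GetD3Copier();
        Dest dest = new Dest();
        copy(new Source { D3 = 2.5 }, dest);
        Console.WriteLine(dest.D3 == (2.5).ToString());                        // True
        Console.WriteLine(ReferenceEquals(copy, CopierCache.GetD3Copier()));   // True
    }
}
```

The second line printed by `Main` shows that a repeated lookup returns the cached delegate instead of emitting and compiling the method again, so the cost of generation is paid once per type pair and stays out of the timed loop.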
Upvotes: 1