Reputation: 34880
I am doing some tinkering on the performance of virtual vs sealed members.
Below is my test code.
The output is
virtual total 3166ms
per call virtual 3.166ns
sealed total 3931ms
per call sealed 3.931ns
I must be doing something wrong because according to this the virtual call is faster than the sealed call.
I am running in Release mode with "Optimize code" turned on.
Edit: When run outside of VS (as a console app), the times are close to a dead heat, but the virtual call almost always comes out in front.
using System;
using System.Diagnostics;
using NUnit.Framework;

[TestFixture]
public class VirtTests
{
    public class ClassWithNonEmptyMethods
    {
        private double x;
        private double y;

        public virtual void VirtualMethod()
        {
            x++;
        }

        public void SealedMethod()
        {
            y++;
        }
    }

    const int iterations = 1000000000;

    [Test]
    public void NonEmptyMethodTest()
    {
        var foo = new ClassWithNonEmptyMethods();
        //Pre-call
        foo.VirtualMethod();
        foo.SealedMethod();

        var virtualWatch = new Stopwatch();
        virtualWatch.Start();
        for (var i = 0; i < iterations; i++)
        {
            foo.VirtualMethod();
        }
        virtualWatch.Stop();
        Console.WriteLine("virtual total {0}ms", virtualWatch.ElapsedMilliseconds);
        Console.WriteLine("per call virtual {0}ns", ((float)virtualWatch.ElapsedMilliseconds * 1000000) / iterations);

        var sealedWatch = new Stopwatch();
        sealedWatch.Start();
        for (var i = 0; i < iterations; i++)
        {
            foo.SealedMethod();
        }
        sealedWatch.Stop();
        Console.WriteLine("sealed total {0}ms", sealedWatch.ElapsedMilliseconds);
        Console.WriteLine("per call sealed {0}ns", ((float)sealedWatch.ElapsedMilliseconds * 1000000) / iterations);
    }
}
Upvotes: 4
Views: 348
Reputation: 1007
Using the following code as a reference for our test, let's analyze the Microsoft intermediate language (MSIL) generated by the compiler with the Ildasm.exe (IL Disassembler) tool.
namespace ConsoleApp1
{
    class Program
    {
        public sealed class Sealed
        {
            public string Message { get; set; }
            public void DoStuff() { }
        }

        public class Derived : Base
        {
            public sealed override void DoStuff() { }
        }

        public class Base
        {
            public string Message { get; set; }
            public virtual void DoStuff() { }
        }

        static void Main()
        {
            Sealed sealedClass = new Sealed();
            sealedClass.DoStuff();
            Derived derivedClass = new Derived();
            derivedClass.DoStuff();
            Base BaseClass = new Base();
            BaseClass.DoStuff();
        }
    }
}
To run this tool, open the Developer Command Prompt for Visual Studio and execute the command ildasm.
**********************************************************************
** Visual Studio 2017 Developer Command Prompt v15.9.13
** Copyright (c) 2017 Microsoft Corporation
**********************************************************************
C:\Program Files (x86)\Microsoft Visual Studio\2017\Community>ildasm
Once the application is started, load the executable (or assembly) of the previous application.
Double-click the Main method to view the Microsoft intermediate language (MSIL) information.
.method private hidebysig static void Main() cil managed
{
.entrypoint
// Code size 41 (0x29)
.maxstack 8
IL_0000: newobj instance void ConsoleApp1.Program/Sealed::.ctor()
IL_0005: callvirt instance void ConsoleApp1.Program/Sealed::DoStuff()
IL_000a: newobj instance void ConsoleApp1.Program/Derived::.ctor()
IL_000f: callvirt instance void ConsoleApp1.Program/Base::DoStuff()
IL_0014: newobj instance void ConsoleApp1.Program/Base::.ctor()
IL_0019: callvirt instance void ConsoleApp1.Program/Base::DoStuff()
IL_0028: ret
} // end of method Program::Main
As you can see, each class uses newobj to create a new instance, pushing an object reference onto the stack, and callvirt to make a late-bound call to the DoStuff() method of its respective object.
Judging by this information, it seems that the sealed, derived, and base classes are all handled the same way by the compiler. Just to be sure, let's go deeper and analyze the JIT-compiled code with the Disassembly window in Visual Studio.
Enable the Disassembly by selecting Enable address-level debugging, under Tools > Options > Debugging > General.
Set a breakpoint at the beginning of the application and start debugging. Once the application hits the breakpoint, open the Disassembly window by selecting Debug > Windows > Disassembly.
--- C:\Users\Ivan Porta\source\repos\ConsoleApp1\Program.cs --------------------
{
0066084A in al,dx
0066084B push edi
0066084C push esi
0066084D push ebx
0066084E sub esp,4Ch
00660851 lea edi,[ebp-58h]
00660854 mov ecx,13h
00660859 xor eax,eax
0066085B rep stos dword ptr es:[edi]
0066085D cmp dword ptr ds:[5842F0h],0
00660864 je 0066086B
00660866 call 744CFAD0
0066086B xor edx,edx
0066086D mov dword ptr [ebp-3Ch],edx
00660870 xor edx,edx
00660872 mov dword ptr [ebp-48h],edx
00660875 xor edx,edx
00660877 mov dword ptr [ebp-44h],edx
0066087A xor edx,edx
0066087C mov dword ptr [ebp-40h],edx
0066087F nop
Sealed sealedClass = new Sealed();
00660880 mov ecx,584E1Ch
00660885 call 005730F4
0066088A mov dword ptr [ebp-4Ch],eax
0066088D mov ecx,dword ptr [ebp-4Ch]
00660890 call 00660468
00660895 mov eax,dword ptr [ebp-4Ch]
00660898 mov dword ptr [ebp-3Ch],eax
sealedClass.DoStuff();
0066089B mov ecx,dword ptr [ebp-3Ch]
0066089E cmp dword ptr [ecx],ecx
006608A0 call 00660460
006608A5 nop
Derived derivedClass = new Derived();
006608A6 mov ecx,584F3Ch
006608AB call 005730F4
006608B0 mov dword ptr [ebp-50h],eax
006608B3 mov ecx,dword ptr [ebp-50h]
006608B6 call 006604A8
006608BB mov eax,dword ptr [ebp-50h]
006608BE mov dword ptr [ebp-40h],eax
derivedClass.DoStuff();
006608C1 mov ecx,dword ptr [ebp-40h]
006608C4 mov eax,dword ptr [ecx]
006608C6 mov eax,dword ptr [eax+28h]
006608C9 call dword ptr [eax+10h]
006608CC nop
Base BaseClass = new Base();
006608CD mov ecx,584EC0h
006608D2 call 005730F4
006608D7 mov dword ptr [ebp-54h],eax
006608DA mov ecx,dword ptr [ebp-54h]
006608DD call 00660490
006608E2 mov eax,dword ptr [ebp-54h]
006608E5 mov dword ptr [ebp-44h],eax
BaseClass.DoStuff();
006608E8 mov ecx,dword ptr [ebp-44h]
006608EB mov eax,dword ptr [ecx]
006608ED mov eax,dword ptr [eax+28h]
006608F0 call dword ptr [eax+10h]
006608F3 nop
}
0066091A nop
0066091B lea esp,[ebp-0Ch]
0066091E pop ebx
0066091F pop esi
00660920 pop edi
00660921 pop ebp
00660922 ret
As we can see in the previous code, while the creation of the objects is the same, the instructions executed to invoke the methods of the sealed class and of the derived/base classes are slightly different. After loading the object reference into the ecx register (mov instruction), the call on the sealed class performs a comparison between dword ptr [ecx] and ecx (cmp instruction), which dereferences the reference and so acts as a null check, and then calls the method directly at a fixed address. The calls on the derived and base classes instead load the method table pointer from the object (two mov instructions) and make an indirect call through it.
According to the report written by Torbjörn Granlund, Instruction latencies and throughput for AMD and Intel x86 processors, the speeds of the following instructions on an Intel Pentium 4 are:
In conclusion, the optimizations in today's compilers and processors have made the performance difference between sealed and non-sealed classes so small that it is irrelevant to the majority of applications.
Upvotes: 0
Reputation: 942187
You are testing the effects of memory alignment on code efficiency. The 32-bit JIT compiler has trouble generating efficient code for value types that are more than 32 bits in size, long and double in C# code. The root of the problem is the 32-bit GC heap allocator: it only promises alignment of allocated memory on addresses that are a multiple of 4. That's an issue here because you are incrementing doubles. A double is efficient only when it is aligned on an address that's a multiple of 8. The stack has the same issue for local variables: it is also aligned only to 4 on a 32-bit machine.
The L1 CPU cache is internally organized in blocks called "cache lines". There is a penalty when the program reads a misaligned double, especially one that straddles the end of a cache line, since bytes from two cache lines have to be read and glued together. Misalignment isn't uncommon in the 32-bit jitter; it is merely 50-50 odds that the 'x' field happens to be allocated on an address that's a multiple of 8. If it isn't, then 'x' and 'y' are going to be misaligned and one of them may well straddle the cache line. The way you wrote the test, that's going to make either VirtualMethod or SealedMethod slower. Make sure you let them use the same field to get comparable results.
The same is true for code. Swap the code for the virtual and sealed test to arbitrarily change the outcome. I had no trouble making the sealed test quite a bit faster that way. Given the modest difference in speed, you are probably looking at a code alignment issue. The x64 jitter makes an effort to insert NOPs to get a branch target aligned, the x86 jitter doesn't.
You should also run the timing test several times in a loop, at least 20. You are then also likely to observe the effect of the garbage collector moving the class object. The double may have a different alignment afterward, dramatically changing the timing. Accessing a 64-bit value type like long or double has 3 distinct timings, in fast to slow order: aligned on 8, aligned on 4 within a cache line, and aligned on 4 across two cache lines.
The penalty is steep: reading a double that straddles a cache line is roughly three times slower than reading an aligned one. This is also the core reason why a double[] (array of doubles) is allocated in the Large Object Heap even when it has only 1000 elements, well below the normal threshold of 85,000 bytes: the LOH has an alignment guarantee of 8. These alignment problems entirely disappear in code generated by the x64 jitter, where both the stack and the GC heap have an alignment of 8.
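To make the two suggestions above concrete (use the same field in both methods, and repeat the timing at least 20 times), here is a minimal sketch. The names Counter, NonVirtualMethod and AlignmentAwareBenchmark are made up for illustration, and the iteration count is reduced from the question's value only so that 20 rounds finish in reasonable time. Treat it as one possible way to restructure the test, not a definitive benchmark.

```csharp
using System;
using System.Diagnostics;

// Hypothetical class for illustration: both methods increment the SAME field,
// so any misalignment of that field penalizes both loops equally.
public class Counter
{
    private double x;

    public virtual void VirtualMethod() { x++; }
    public void NonVirtualMethod() { x++; }

    public double Value { get { return x; } }
}

public static class AlignmentAwareBenchmark
{
    // Assumption: smaller than the question's 1000000000 so 20 rounds stay quick.
    const int Iterations = 100000000;
    const int Rounds = 20;

    public static void Main()
    {
        var foo = new Counter();

        // Pre-call so both methods are JIT-compiled before timing starts.
        foo.VirtualMethod();
        foo.NonVirtualMethod();

        for (int round = 0; round < Rounds; round++)
        {
            var sw = Stopwatch.StartNew();
            for (int i = 0; i < Iterations; i++) foo.VirtualMethod();
            sw.Stop();
            long virtualMs = sw.ElapsedMilliseconds;

            sw.Restart();
            for (int i = 0; i < Iterations; i++) foo.NonVirtualMethod();
            sw.Stop();
            long nonVirtualMs = sw.ElapsedMilliseconds;

            Console.WriteLine("round {0}: virtual {1}ms, non-virtual {2}ms",
                round, virtualMs, nonVirtualMs);
        }
    }
}
```

If the per-round numbers shift abruptly partway through the run, that is consistent with the object having been moved to a differently aligned address, as described above.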
Upvotes: 4
Reputation: 78673
You might be seeing some startup cost. Try wrapping the Test-A/Test-B code in a loop and running it several times. You might also be seeing some kind of ordering effect. To avoid that (and top/bottom of loop effects), unroll it 2-3 times.
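One way to act on this is sketched below: the two tests run in a loop, the order alternates each round, and the first two rounds are discarded as warm-up. Target, OrderingBenchmark, TimeA, TimeB and all the constants are hypothetical names and values chosen for illustration. Splitting the loops into helper methods is itself a design choice that can change what the JIT generates, so treat the numbers as indicative only.

```csharp
using System;
using System.Diagnostics;

// Hypothetical class for illustration: A is virtual, B is not.
public class Target
{
    private double x;

    public virtual void A() { x++; }
    public void B() { x++; }

    public double Value { get { return x; } }
}

public static class OrderingBenchmark
{
    const int Iterations = 100000000; // assumption: adjust to taste
    const int Rounds = 10;
    const int WarmupRounds = 2;       // one A-first and one B-first round are discarded

    static long TimeA(Target t)
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < Iterations; i++) t.A();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    static long TimeB(Target t)
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < Iterations; i++) t.B();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        var t = new Target();
        long totalA = 0, totalB = 0;

        for (int round = 0; round < Rounds; round++)
        {
            long a, b;
            if (round % 2 == 0) { a = TimeA(t); b = TimeB(t); } // A first
            else                { b = TimeB(t); a = TimeA(t); } // B first

            if (round < WarmupRounds) continue; // skip startup cost

            totalA += a;
            totalB += b;
        }

        int counted = Rounds - WarmupRounds;
        Console.WriteLine("A (virtual) average {0}ms", totalA / counted);
        Console.WriteLine("B (non-virtual) average {0}ms", totalB / counted);
    }
}
```

With these constants, the counted rounds contain four A-first and four B-first runs, so a consistent ordering effect should affect both averages equally.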
Upvotes: 1
Reputation: 117310
First, you have to mark the method sealed.
Secondly, provide an override to the virtual method. Create an instance of the derived class.
As a third test, create a sealed override method.
Now you can start comparing.
Edit: You should probably run this outside VS.
Update:
Example of what I mean.
abstract class Foo
{
    // virtual members cannot be private, so an access modifier is required
    public virtual void Bar() {}
}

class Baz : Foo
{
    public sealed override void Bar() {}
}

class Woz : Foo
{
    public override void Bar() {}
}
Now test the call speed of Bar for an instance of Baz and Woz.
I also suspect member and class visibility outside the assembly could affect JIT analysis.
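A rough sketch of such a comparison follows. The field x, the Value property, the SealedOverrideBenchmark class and the iteration count are additions for illustration and are not part of the example above. The method bodies are made non-empty so that the calls do some work, as in the question.

```csharp
using System;
using System.Diagnostics;

public abstract class Foo
{
    protected double x; // hypothetical field so Bar has something to do

    public virtual void Bar() { }

    public double Value { get { return x; } }
}

public class Baz : Foo
{
    // sealed override: no class deriving from Baz can override Bar again
    public sealed override void Bar() { x++; }
}

public class Woz : Foo
{
    // plain override: a class deriving from Woz could still override Bar
    public override void Bar() { x++; }
}

public static class SealedOverrideBenchmark
{
    const int Iterations = 100000000; // assumption: adjust to taste

    public static void Main()
    {
        var baz = new Baz();
        var woz = new Woz();

        // Pre-call so both overrides are JIT-compiled before timing.
        baz.Bar();
        woz.Bar();

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < Iterations; i++) baz.Bar();
        sw.Stop();
        Console.WriteLine("sealed override total {0}ms", sw.ElapsedMilliseconds);

        sw.Restart();
        for (int i = 0; i < Iterations; i++) woz.Bar();
        sw.Stop();
        Console.WriteLine("override total {0}ms", sw.ElapsedMilliseconds);
    }
}
```

The variables are typed as Baz and Woz rather than Foo on purpose: if they were typed as Foo, the call site would carry no information about the sealed override.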
Upvotes: 1