Reputation: 5514
So I'm emitting some dynamic proxies via DefineDynamicAssembly, and while testing I found that emitting one type per dynamic assembly is much faster than emitting all types into a single dynamic assembly.
In my test I generate 10,000 types, and the one-type-per-assembly code runs about 8-10 times faster. The memory usage is completely in line with what I expected, but why does it take so much longer to generate the types in a single assembly?
Edit: Added some sample code.
One assembly:
    // Requires: using System; using System.Reflection; using System.Reflection.Emit;
    var an = new AssemblyName( "Foo" );
    var ab = AppDomain.CurrentDomain.DefineDynamicAssembly( an, AssemblyBuilderAccess.Run );
    var mb = ab.DefineDynamicModule( "Bar" );
    // All 10,000 types go into the same module of the same assembly.
    for( int i = 0; i < 10000; i++ )
    {
        var tb = mb.DefineType( "Baz" + i.ToString( "000" ) );
        var met = tb.DefineMethod( "Qux", MethodAttributes.Public );
        met.SetReturnType( typeof( int ) );
        var ilg = met.GetILGenerator();
        ilg.Emit( OpCodes.Ldc_I4, 4711 ); // method body: return 4711;
        ilg.Emit( OpCodes.Ret );
        tb.CreateType();
    }
One assembly per type:
    // Requires: using System; using System.Reflection; using System.Reflection.Emit;
    for( int i = 0; i < 10000; i++ )
    {
        // A fresh assembly and module are created for every single type.
        var an = new AssemblyName( "Foo" );
        var ab = AppDomain.CurrentDomain.DefineDynamicAssembly( an, AssemblyBuilderAccess.Run );
        var mb = ab.DefineDynamicModule( "Bar" );
        var tb = mb.DefineType( "Baz" + i.ToString( "000" ) );
        var met = tb.DefineMethod( "Qux", MethodAttributes.Public );
        met.SetReturnType( typeof( int ) );
        var ilg = met.GetILGenerator();
        ilg.Emit( OpCodes.Ldc_I4, 4711 ); // method body: return 4711;
        ilg.Emit( OpCodes.Ret );
        tb.CreateType();
    }
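To compare the two variants side by side, a small timing harness along these lines may help. It is only a sketch: the class and method names (EmitTiming, TimeSingleAssembly, TimeAssemblyPerType) are made up for illustration, and it uses the static AssemblyBuilder.DefineDynamicAssembly overload instead of AppDomain.CurrentDomain.DefineDynamicAssembly on the assumption that it behaves the same for this purpose.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical helper, not part of the original post.
public static class EmitTiming
{
    // Creates a new dynamic assembly "Foo" with a single module "Bar".
    static ModuleBuilder NewModule()
    {
        var ab = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("Foo"), AssemblyBuilderAccess.Run);
        return ab.DefineDynamicModule("Bar");
    }

    // Emits one trivial type whose method Qux returns 4711.
    static void EmitType(ModuleBuilder mb, int i)
    {
        var tb = mb.DefineType("Baz" + i.ToString("000"));
        var met = tb.DefineMethod("Qux", MethodAttributes.Public);
        met.SetReturnType(typeof(int));
        var ilg = met.GetILGenerator();
        ilg.Emit(OpCodes.Ldc_I4, 4711);
        ilg.Emit(OpCodes.Ret);
        tb.CreateType();
    }

    // All types share one assembly and one module.
    public static TimeSpan TimeSingleAssembly(int count)
    {
        var sw = Stopwatch.StartNew();
        var mb = NewModule();
        for (int i = 0; i < count; i++) EmitType(mb, i);
        return sw.Elapsed;
    }

    // Every type gets its own assembly and module.
    public static TimeSpan TimeAssemblyPerType(int count)
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < count; i++) EmitType(NewModule(), i);
        return sw.Elapsed;
    }
}
```

Calling EmitTiming.TimeSingleAssembly(10000) and EmitTiming.TimeAssemblyPerType(10000) and printing both results should show the gap; the absolute numbers will depend on the machine and runtime version.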
Upvotes: 19
Views: 803
Reputation: 2337
I looked into why defining multiple modules in one assembly is slower than creating a new assembly with one module each time, using these pieces of code:
Single-Assembly Scenario:
    var an = new AssemblyName("Foo");
    var ab = AppDomain.CurrentDomain.DefineDynamicAssembly(an, AssemblyBuilderAccess.Run);
    for (int i = 0; i < 10000; i++)
    {
        ab.DefineDynamicModule("Bar" + i.ToString("000"));
    }
Multi-Assembly Scenario:
    var an = new AssemblyName("Foo");
    for (int i = 0; i < 10000; i++)
    {
        var ab = AppDomain.CurrentDomain.DefineDynamicAssembly(an, AssemblyBuilderAccess.Run);
        ab.DefineDynamicModule("Bar");
    }
In the single-assembly scenario, DefineDynamicModule() is under pressure. However, when using multiple assemblies, this method never gets called; instead, other methods are responsible for the remaining 50%.
Let's go deeper inside the ECMA-335 documentation for CLI.
II.6 An assembly is a set of one or more files deployed as a unit.
Page 140
So we understand now that an assembly is essentially a package and modules are the main components. That being said:
II.6 A module is a single file containing executable content in the format specified here. If the module contains a manifest then it also specifies the modules (including itself) that constitute the assembly. An assembly shall contain only one manifest amongst all its constituent files.
Page 140
Based on this information, we know that when we create the assembly, we automatically add one module to the assembly as well. This is why we never get a hit on the CLR's DefineDynamicModule() function if we keep creating new assemblies. Instead, we get a hit on the CLR's GetInMemoryAssemblyModule() method to retrieve information about the manifest module (the module that is created automatically).
So the multi-assembly code has a small advantage here: with one assembly we end up with 10,001 modules, but with multiple assemblies we get a total of 10,000. That is not much, though, so this one extra module should not be the main reason for the difference.
II.6.5 When an item is in the current assembly, but is part of a module other than the one containing the manifest, the defining module shall be declared in the manifest of the assembly using the .module extern directive.
Page 146
and
II.6.7 The manifest module, of which there can only be one per assembly, includes the .assembly directive. To export a type defined in any other module of an assembly requires an entry in the assembly’s manifest.
Page 146
Therefore, each time you create a new module, you are actually adding a new file to an archive and then modifying the first file of the archive to reference the new module. Essentially, in the single-assembly code we add 10000 modules and then edit the first module 10000 times. This isn't the case with the multi-assembly code, in which we only edit the first, automatically generated module, 10000 times.
This is the overhead we see. And it grows roughly quadratically on my system: each time the module count doubles, the time about quadruples.
(5000 = 1.5s, 10000 = 6s, 20000 = 25s)
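As a quick check on how fast these timings grow, one can estimate the exponent k in t ~ n^k from pairs of measurements. The helper below is a hypothetical sketch (GrowthOrder.Exponent is not an existing API); it only applies the usual log-ratio formula to the numbers quoted above.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class GrowthOrder
{
    // Estimates k in t ~ n^k from two measurements (n1, t1) and (n2, t2).
    public static double Exponent(double n1, double t1, double n2, double t2)
    {
        return Math.Log(t2 / t1) / Math.Log(n2 / n1);
    }
}
```

With the figures above, GrowthOrder.Exponent(5000, 1.5, 10000, 6) comes out at about 2.0 and GrowthOrder.Exponent(10000, 6, 20000, 25) at about 2.06, so the time quadruples when the count doubles, which is what quadratic growth looks like.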
With your code, however, the bottleneck is the unmanaged CLR's SetMethodIL function, called from the TypeBuilder.CreateTypeNoLock() method, and I couldn't find anything in the documentation about this yet.
Unfortunately, it is hard to decompile and understand CLR.dll to see what actually happens there. As a result, at this stage we are just making guesses based on the information Microsoft has published.
Upvotes: 8
Reputation: 26926
On my PC in LINQPad using C# 7.0, one assembly takes about 8.8 seconds and one assembly per type takes about 2.6 seconds. Most of the time in the one-assembly case is spent in DefineType and CreateType, whereas in the one-assembly-per-type case the time is mainly spent in DefineDynamicAssembly + DefineDynamicModule.
DefineType checks that there are no name conflicts, which is a Dictionary lookup. If the Dictionary is empty, this amounts to a check for null.
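The name-conflict check is observable from the outside: defining a second type with the same name in the same module is rejected. The snippet below is a minimal sketch of that behaviour; the class name DuplicateNameCheck is made up for illustration, and it assumes the static AssemblyBuilder.DefineDynamicAssembly overload is available on the runtime in use.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical helper for illustration only.
public static class DuplicateNameCheck
{
    // Returns true if defining a second type named "Baz" is rejected.
    public static bool SecondDefineThrows()
    {
        var ab = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("Foo"), AssemblyBuilderAccess.Run);
        var mb = ab.DefineDynamicModule("Bar");
        mb.DefineType("Baz");
        try
        {
            mb.DefineType("Baz"); // same name again: the conflict check fires
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }
}
```

This only demonstrates that the check exists; it says nothing about how much time the lookup costs.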
The majority of the time is spent in CreateType, but I don't see where exactly. It does appear, however, that adding types to a single Module requires extra time.
Creating multiple modules slows the whole process down, but most of the time is spent creating modules and in DefineType, which has to scan every module for a duplicate and so performs a growing number of null checks, up to 10,000. With a unique module per type, CreateType is very fast.
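A per-phase breakdown like the one described above can be approximated without a profiler by wrapping DefineType and CreateType in separate stopwatches. This is a rough sketch with made-up names (PhaseTiming, Measure), shown for the single-assembly, single-module case; starting and stopping a Stopwatch in a tight loop adds its own overhead, so the numbers are only indicative.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical helper for illustration only.
public static class PhaseTiming
{
    // Accumulates the time spent in DefineType and in CreateType separately.
    public static void Measure(int count, out TimeSpan defineType, out TimeSpan createType)
    {
        var ab = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("Foo"), AssemblyBuilderAccess.Run);
        var mb = ab.DefineDynamicModule("Bar");
        var define = new Stopwatch();
        var create = new Stopwatch();

        for (int i = 0; i < count; i++)
        {
            define.Start();
            var tb = mb.DefineType("Baz" + i.ToString("000"));
            define.Stop();

            var met = tb.DefineMethod("Qux", MethodAttributes.Public);
            met.SetReturnType(typeof(int));
            var ilg = met.GetILGenerator();
            ilg.Emit(OpCodes.Ldc_I4, 4711);
            ilg.Emit(OpCodes.Ret);

            create.Start();
            tb.CreateType();
            create.Stop();
        }

        defineType = define.Elapsed;
        createType = create.Elapsed;
    }
}
```

Running Measure with increasing counts and comparing the two totals is one way to see which phase dominates as the module fills up.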
Upvotes: 8