When to create and distribute "reference assemblies"?

Question

C# 7.1 introduced a few new command-line parameters to help create "reference assemblies". According to the documentation, they output assemblies which:

have their method bodies replaced with a single throw null body, but include all members except anonymous types.

I've found an interesting note saying that a reference assembly is more stable across changes:

That means it changes less often than the full assembly--many common development activities don't change the interface, only the implementation. That means that incremental builds can be much faster- ...

and that it is probably necessary for Roslyn itself ..

We will be introducing a second concept, which is "reference assemblies" (also called skeleton assemblies). [---] They will be used for build scenarios.

.. whatever those "build scenarios" are for Roslyn.

I understand that for ordinary .NET assembly users, such an assembly is probably smaller and slightly faster to load for reflection. OK, but:

usually you also care about execution and the implementation assembly already contains all the data from reference assembly,
quite often you don't care about that minor performance difference on loading,
and most importantly - usually you don't have that stripped-down reference assembly available (distributed) at all.

Its usefulness seems rather niche.

So I wonder about the general assembly producer's side of things: when should one consider explicitly using those new compiler flags to create a reference assembly? Does it have any practical use outside Roslyn itself?

Rainer Sigwald · Accepted Answer

The motivation for this feature is indeed build scenarios, but they're not specific to Roslyn; they're your build scenarios, too.

When you build your project, the build engine (MSBuild) needs to decide whether each output of the build is up to date with respect to its inputs. For example, if you don't change anything and just run build twice in a row, the second time doesn't need to invoke the C# compiler: the assembly was already correct.

Reference assemblies allow skipping the compile step for assemblies in more scenarios, so your builds can be faster. I think an example would help illustrate.

Suppose you have a solution containing B.exe that depends on A.dll.

The compiler command line for B would look something like

csc.exe /out:B.exe /r:..\A\bin\A.dll Program.cs

And its inputs would be

The source for B (Program.cs)
The assembly for A.

If you change the source of A and build your solution, the compiler must run for A, producing a new A.dll. Then, since A.dll is an input to the compilation of B, B has to be recompiled, too.

Using a reference assembly for A changes this slightly

csc.exe /out:B.exe /r:..\A\bin\ref\A.dll Program.cs

B's input for A is now A's reference assembly, rather than its implementation (normal) assembly.

Since the reference assembly is smaller than the full assembly, that has a minor effect on build time all by itself. But that's not enough to justify this feature. What's important is that the compiler only cares about the public API surface of the passed-in references. If an internal implementation detail of the assembly has changed, assemblies that reference it do not need to be recompiled to pick up the new behavior. As @Hans Passant mentions in comments, this is how the .NET Framework itself can deliver compatible performance improvements and bug fixes on unchanged user code.
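To make "public API surface" concrete, here is a minimal sketch. The Calculator class and the three namespaces are hypothetical and exist only so the variants fit in one compilable file; a real reference assembly is produced by the compiler rather than written by hand.

```csharp
using System.Linq;

// Hypothetical source of A.dll before the change.
namespace A.Version1
{
    public class Calculator
    {
        public int Sum(int[] values)
        {
            int total = 0;
            foreach (int value in values) total += value;
            return total;
        }
    }
}

// Hypothetical source of A.dll after an implementation-only change:
// the body differs, but the public signature is exactly the same.
namespace A.Version2
{
    public class Calculator
    {
        public int Sum(int[] values)
        {
            return values.Sum();
        }
    }
}

// Roughly what the reference assembly carries for either version:
// the same public signature with the body replaced by "throw null".
// Assumption: A is compiled with something along the lines of
//   csc.exe /target:library /out:A.dll /refout:ref\A.dll Calculator.cs
namespace A.ReferenceView
{
    public class Calculator
    {
        public int Sum(int[] values) { throw null; }
    }
}
```

Because Version1 and Version2 expose the same signature, both would yield the same reference view, which is all that the compiler needs when it compiles B.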

The benefit of the reference assemblies feature comes from the MSBuild work done to use them. Suppose you change an internal implementation detail in A but don't change its public interface. On the next build,

The compiler must run for A, because source files for A changed.
The compiler emits both A.dll (with the changed implementation) and ref\A.dll, which is identical to the previous reference assembly.
Since ref\A.dll is identical to the previous output, it does not get copied to A's output folder.
When it is time for B's compiler to run, it sees that none of its inputs have changed--neither B's own code, nor A's reference assembly, so the compiler doesn't have to run.
B then copies the updated A.dll to its output and is ready to run with the new behavior.
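The steps above can be sketched roughly as follows. This is not MSBuild's actual code: it assumes a timestamp-based up-to-date check and a copy step that skips identical files, and the class and method names are made up for illustration.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper class, not part of MSBuild.
static class IncrementalBuildSketch
{
    // Copy only when the content differs. If the file is unchanged,
    // the destination is not touched and keeps its old timestamp.
    public static bool CopyIfDifferent(string source, string destination)
    {
        if (File.Exists(destination) &&
            File.ReadAllBytes(source).SequenceEqual(File.ReadAllBytes(destination)))
        {
            return false; // identical: skip the copy
        }

        File.Copy(source, destination, overwrite: true);
        return true;
    }

    // An output counts as up to date when it exists and no input
    // was written more recently than it was.
    public static bool IsUpToDate(string output, params string[] inputs)
    {
        if (!File.Exists(output)) return false;

        DateTime outputTime = File.GetLastWriteTimeUtc(output);
        return inputs.All(input => File.GetLastWriteTimeUtc(input) <= outputTime);
    }
}
```

In this sketch, A's build would call CopyIfDifferent for the freshly emitted reference assembly, and B's build would call IsUpToDate with B.exe as the output and Program.cs plus ref\A.dll as the inputs. After an implementation-only change to A, the copy is skipped, so the check for B succeeds and its compile step can be skipped.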

The effect of skipping downstream compilation can compound as you go along in a large solution--changing a comment in {ProjectName}.Utilities.dll no longer requires building everything!

Many changes involve changing both the public API surface and the internal implementation, so this change doesn't speed up all builds, but it does speed up many builds.
