danglingPointer

Reputation: 916

Understanding automatic inlining: when can the compiler inline methods involving private variables & abstract methods?

Using C#, but I presume this question is relevant for other (mostly C-related) languages as well. Consider this...

private float radius = 0.0f; // Set somewhere else
public float GetDiameter() {
   return radius * 2.0f;
}

Will the compiler inline this when it is called from other classes? I would think the answer is "of course", but here is my confusion: radius is private. From a manual-programming perspective it would be impossible for us to inline this method by hand, since radius is private.

So what does the compiler do? I presume it can inline it anyhow, since if I remember correctly the 'private', 'public', etc. modifiers only affect human-written code, and the assembly language can access any part of its own program if it wants?

Okay, but what about abstraction? Consider this...

public abstract class Animal {
   abstract public bool CanFly();
}

public class Hawk : Animal {
...
   override public bool CanFly() {
      if (age < 1.0f) return false; // Baby hawks can't fly yet
      return true;
   }
}

public class Dog : Animal {
...
   override public bool CanFly() {
      return false;
   }
}

In a non-animal class:

...
Animal a = GetNextAnimal();
if (a.CanFly()) {
...

Can this be inlined? I am almost certain no, because the compiler doesn't know what kind of animal is being used. But what if instead I did...

...
Animal a = new Hawk();
if (a.CanFly()) {
...

Does that make a difference? If not, surely this one can be inlined?

...
Hawk a = new Hawk();
if (a.CanFly()) {
...

Does anything change if, instead of a bool method above, I were to do:

float animalAge = a.GetAge();

In general, can too many abstract getters and setters cause a performance hit? If the cost becomes significant, what would be the best solution?

Upvotes: 4

Views: 420

Answers (1)

Hans Passant

Reputation: 942177

There is in general no simple way to predict up front whether or not a method will get inlined. You have to actually write a program and look at the machine code that is produced for it. This is pretty easy to do in a C program: you can ask the compiler to produce an assembly code listing (like /FA for MSVC, -S for GCC).

It is more convoluted in .NET because the jitter compiles the code just in time. Technically the source code of the optimizer is available from the CoreCLR project, but it is very hard to figure out what it does: lots of pretty impenetrable C++ code. You have to take advantage of the "visual" in Visual Studio and use the debugger.

That requires a bit of preparation to be sure you get the actual optimized code, since the debugger normally disables the optimizer to make debugging easy. Switch to the Release configuration and use Tools > Options > Debugging > General > untick the "Suppress JIT optimization" checkbox. If you want optimal floating point code then you always, always want 64-bit code, so use Project > Properties > Build tab, untick "Prefer 32-bit".

And write a little test program to exercise the method. That can be tricky, you might easily end up with no code at all. In this case it is easy, Console.WriteLine() is a good way to force this method to be used, it cannot be optimized away. So:

using System;

class Program {
    static void Main(string[] args) {
        var obj = new Example();
        Console.WriteLine(obj.GetDiameter());
    }
}

class Example {
    private float radius = 0.0f;
    public float GetDiameter() {
        return radius * 2.0f;
    }
}

Set a breakpoint on Main() and press F5. Then use Debug > Windows > Disassembly to look at the machine code. On my machine with a Haswell core (supports AVX) I get:

00007FFEB9D50480  sub         rsp,28h                   ; setup stack frame
00007FFEB9D50484  mov         rcx,7FFEB9C45A78h         ; rcx = typeof(Example)
00007FFEB9D5048E  call        00007FFF19362530          ; rax = new Example()
00007FFEB9D50493  vmovss      xmm0,dword ptr [rax+8]    ; xmm0 = Example.field
00007FFEB9D50499  vmulss      xmm0,xmm0,dword ptr [7FFEB9D504B0h]  ; xmm0 *= 2.0
00007FFEB9D504A2  call        00007FFF01647BB0          ; Console.WriteLine()
00007FFEB9D504A7  nop                                   ; alignment
00007FFEB9D504A8  add         rsp,28h                   ; tear down stack frame
00007FFEB9D504AC  ret 

I annotated the code to help make sense of it; it can be cryptic if you have never looked at machine code before. But no doubt you can tell that the method got inlined. There is no CALL instruction for GetDiameter(), it got inlined to two instructions (VMOVSS and VMULSS).

As you expected. Accessibility plays no role whatsoever in inlining decisions, it is a simple code hoisting trick that does not change the logical operation of the program. It matters to the C# compiler first, next to the verifier built into the jitter, but then disappears as a concern to the code generator and optimizer.

Just do the exact same thing for the abstract class. You'll see that the method does not get inlined, an indirect CALL instruction is required. Even if the method is completely empty. Some language compilers can turn virtual method calls into non-virtual calls when they know the type of the object but the C# compiler is not one of them. The jitter optimizer doesn't either. EDIT: recent work was done on devirtualizing calls.
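To try the abstract case yourself, a minimal sketch along these lines could be used. It assumes a trimmed-down Animal/Dog pair from the question; marking Dog as sealed and the GetNextAnimal() factory body are my own hypothetical additions. Whether the second call is actually devirtualized or inlined depends on the runtime version, so treat that as something to verify in the Disassembly window rather than as a given.

```csharp
using System;

public abstract class Animal
{
    public abstract bool CanFly();
}

// sealed is an assumption added for this sketch: no subclass can
// override CanFly() again, which gives a devirtualizing jitter a chance.
public sealed class Dog : Animal
{
    public override bool CanFly() => false;
}

public static class Program
{
    // Hypothetical stand-in for GetNextAnimal() from the question.
    static Animal GetNextAnimal() => new Dog();

    public static void Main()
    {
        // Static type is Animal: expect an indirect (virtual) call here,
        // unless the jitter can see through the factory.
        Animal a = GetNextAnimal();
        Console.WriteLine(a.CanFly());

        // Static type is the sealed Dog: a candidate for a direct call.
        Dog d = new Dog();
        Console.WriteLine(d.CanFly());
    }
}
```

Set a breakpoint on Main() as before and compare the code generated for the two CanFly() calls.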

There are other reasons why a method won't be inlined, a moving target so difficult to document. But roughly, methods with too much MSIL, try/catch/throw, loops, CAS demands, some degenerate struct cases, MarshalByRefObject base won't be inlined. Always look at actual machine code to be sure.

The [MethodImpl(MethodImplOptions.AggressiveInlining)] attribute can force the optimizer to reconsider the MSIL limit. MethodImplOptions.NoInlining is helpful to disable inlining, the kind of thing you might want to do to get a better exception stack trace or to hold the jitter back because an assembly might not be deployed.
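As a rough sketch of how the two options are applied (the Circle class and GetArea() are hypothetical names, only GetDiameter() comes from the question). The attribute is a hint to the jitter, so the disassembly is still the only way to confirm what actually happened:

```csharp
using System;
using System.Runtime.CompilerServices;

public class Circle
{
    private readonly float radius;

    public Circle(float radius) { this.radius = radius; }

    // Asks the jitter to inline this even if its size heuristic
    // would otherwise reject the method.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public float GetDiameter() => radius * 2.0f;

    // Keeps this method as a real CALL, so it retains its own
    // frame in an exception stack trace.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public float GetArea() => (float)Math.PI * radius * radius;
}

public static class Program
{
    public static void Main()
    {
        var c = new Circle(1.5f);
        // Console.WriteLine keeps both results from being optimized away.
        Console.WriteLine(c.GetDiameter());
        Console.WriteLine(c.GetArea());
    }
}
```

With the Release setup described earlier, GetArea() should still show up as a CALL in Main().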

More about the optimizations performed by the jitter optimizer in this post.

Upvotes: 2
