Arseni Mourzenko

Reputation: 52321

Why doesn't the JIT compiler remove an unused variable?

A recent comment on my answer suggested that a variable is created twice.

At first, I started writing the following comment:

I'm pretty sure .NET's JIT compiler will rewrite the code by moving the declarations of both variables to the place they are actually used. [...]

But then I decided to check my claims. To my surprise, it looks like I'm plainly wrong.

Let's start with the following piece of code:

class Something
{
    public string text;
    public int number;
    public Something(string text, int number)
    {
        Console.WriteLine("Initialized {0}.", number);
        this.text = text;
        this.number = number;
    }
}

static void Display(Something something)
{
    Console.WriteLine(something.text, something.number);
}

static int x = 0;

public static void Main()
{
    var first = new Something("Hello, {0}!", 123);
    var second = new Something("World, {0}!", 456);

    Display(x > 0 ? first : second);
}

Warning: the code is a POC and has severe style issues such as public fields; don't write code like this outside prototypes.

The output is the following:

Initialized 123.
Initialized 456.
World, 456!

Let's change the Main() method a bit:

void Main()
{
    Display(
        x > 0 ?
        new Something("Hello, {0}!", 123) :
        new Something("World, {0}!", 456));
}

Now the output becomes:

Initialized 456.
World, 456!

By the way, if I look at IL of the modified version, both newobj instructions are still there:

IL_0000:  ldarg.0     
IL_0001:  ldarg.0     
IL_0002:  ldfld       UserQuery.x
IL_0007:  ldc.i4.0    
IL_0008:  bgt.s       IL_001B
IL_000A:  ldstr       "World, {0}!"
IL_000F:  ldc.i4      C8 01 00 00 
IL_0014:  newobj      UserQuery+Something..ctor
IL_0019:  br.s        IL_0027
IL_001B:  ldstr       "Hello, {0}!"
IL_0020:  ldc.i4.s    7B 
IL_0022:  newobj      UserQuery+Something..ctor
IL_0027:  call        UserQuery.Display
IL_002C:  ret

This means that the compiler left both initialization instructions untouched, but JIT optimized them by keeping only one.

What is the reason JIT doesn't optimize the original piece of code by removing the unused variable and its assignment?

Upvotes: 2

Views: 811

Answers (3)

Kyle

Reputation: 6684

I'm not sure what JIT optimization has to do with anything here. The two examples you provide exhibit different behavior because they're different, not because there's some kind of JIT optimization occurring.

When you write:

var first = new Something("Hello, {0}!", 123);
var second = new Something("World, {0}!", 456);

Display(x > 0 ? first : second);

That creates two Something objects, executing each of their constructors to do so, and stores a reference to each one in its respective variable. Then you use a ternary to select which of those to pass to the Display method. The way you've written the code, both constructors must be executed; it doesn't matter that only one of the instances may be used later.

Display(
    x > 0 ?
    new Something("Hello, {0}!", 123) :
    new Something("World, {0}!", 456));

This is entirely different. Now you have a ternary which will only evaluate one of its operands after the condition.
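To make the difference concrete, here is a minimal sketch (not the code from the question). Make is a hypothetical helper that records each call in a list, standing in for the Something constructor and its console output. Eager mirrors the two-variable version, and Lazy mirrors the version with the expressions inside the conditional operator.

```csharp
using System;
using System.Collections.Generic;

public static class TernaryDemo
{
    // Hypothetical log: records every call to Make so the evaluation order is visible.
    public static readonly List<int> Calls = new List<int>();

    // Stand-in for a constructor with a side effect.
    public static int Make(int number)
    {
        Calls.Add(number);
        return number;
    }

    // Both calls run before the conditional operator picks a result.
    public static int Eager(bool condition)
    {
        var first = Make(123);
        var second = Make(456);
        return condition ? first : second;
    }

    // Only the operand selected by the condition is evaluated.
    public static int Lazy(bool condition)
    {
        return condition ? Make(123) : Make(456);
    }

    public static void Main()
    {
        Calls.Clear();
        Eager(false);
        Console.WriteLine(string.Join(", ", Calls)); // 123, 456

        Calls.Clear();
        Lazy(false);
        Console.WriteLine(string.Join(", ", Calls)); // 456
    }
}
```

With the condition false, Eager records both calls while Lazy records only one. That follows from the language's evaluation rules for the conditional operator, so it holds no matter what the JIT does.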

By the way, if I look at IL of the modified version, both newobj instructions are still there:

Of course they are, why wouldn't they be? The IL loads the static x field and compares it with 0. If x > 0 then code execution jumps over the first newobj opcode, so it never gets executed. Instead it executes the second one. If x <= 0, it executes the first newobj and then jumps over the second.

Both opcodes have to be there because there are two things that could happen. Just because they show up in the IL doesn't mean that they both get executed.

The compiler can't get rid of one because when it's compiling the Main method, it can't know that the static field x is going to be 0 when the method gets called at runtime. In order to do that it'd have to be capable of solving the halting problem, which is undecidable in general.

Upvotes: 0

Enfyve

Reputation: 936

First I want to preface this as being an answer because it is simply too complex to be a comment imo.

If you specify x to be static, the JIT produces the stated result, but what if you specify x to be a const? Do you observe the same results?

What if you explicitly write 0 > 0 as the condition of the ternary?
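A minimal sketch of that experiment, assuming x is turned into a const (the ConstDemo class, the Created counter, and the Pick method are made up for illustration and are not from the question). With a const, x > 0 is a constant expression, so the compiler already knows which branch is taken and may drop the other one before the JIT ever sees it; inspecting the IL would be needed to confirm what a given compiler version actually emits.

```csharp
using System;

public static class ConstDemo
{
    // Assumption for illustration: x is a compile-time constant, not a static field.
    const int x = 0;

    // Counts how many Something objects were constructed.
    public static int Created;

    sealed class Something
    {
        public readonly int Number;

        public Something(int number)
        {
            Created++; // the observable side effect of construction
            Number = number;
        }
    }

    public static int Pick()
    {
        // x > 0 is a constant expression here, so the choice is known at compile time.
        // The compiler may warn that the other operand is unreachable.
        var chosen = x > 0 ? new Something(123) : new Something(456);
        return chosen.Number;
    }

    public static void Main()
    {
        Console.WriteLine(Pick());  // 456
        Console.WriteLine(Created); // 1
    }
}
```

The observable behavior is the same as with the static field: only one object is constructed. The difference, if any, would show up in the IL rather than in the output.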

Related to your answer, I think it's worth adding that the JIT also can't remove it because, if you set a breakpoint and manually change x to 1, the program would crash in some spectacular fashion.

Upvotes: 0

Arseni Mourzenko

Reputation: 52321

While writing the question, it occurred to me that the answer is very simple. JIT optimizations are limited to the ones which are considered safe, and randomly removing a call to a constructor is anything but safe, because the constructor may have side effects. In the sample code it actually does have one, since it writes a message to the console.

The optimization (and the lack of it) can be illustrated much more simply:

static string Create(string name)
{
    Console.WriteLine(name);
    return name;
}

public static void Main()
{
    var first = Create("Jeff");
    var second = Create("Alice");
    Console.WriteLine("Hello, {0}!", second);
}

This code will output:

Jeff
Alice
Hello, Alice!

The JIT compiler correctly recognizes that the method has a side effect (the output to the console) and doesn't remove the first call, even though first is never used. Here's the corresponding assembly code:

006B2DA8  push        ebp  
006B2DA9  mov         ebp,esp  
006B2DAB  mov         ecx,dword ptr ds:[335230Ch]  
006B2DB1  call        dword ptr ds:[4770860h]     // Displays "Jeff" to console.
006B2DB7  mov         ecx,dword ptr ds:[3352310h]  
006B2DBD  call        dword ptr ds:[4770860h]     // Displays "Alice" to console.

006B2DC3  mov         ecx,dword ptr ds:[3352314h]  
006B2DC9  mov         edx,eax  
006B2DCB  call        702AE044                    // Displays "Hello, Alice!" to console.
006B2DD0  pop         ebp  
006B2DD1  ret

A slight change to this piece of code produces a very different result. With the Console.WriteLine() statement removed from Create(), the JIT now sees that Create() merely returns the value of its argument. While the IL code still contains two calls to the Create() method:

IL_0000: ldc.i4.s 123
IL_0002: call int32 ConsoleApplication4.Program::Create(int32)
IL_0007: pop
IL_0008: ldc.i4 456
IL_000d: call int32 ConsoleApplication4.Program::Create(int32)
IL_0012: stloc.0
IL_0013: ldstr "Hello, {0}!"
IL_0018: ldloc.0
IL_0019: box [mscorlib]System.Int32
IL_001e: call void [mscorlib]System.Console::WriteLine(string, object)
IL_0023: ret

the JIT compiler gets rid of the first call, producing assembly code which is now much shorter:

00F12DA8  mov         edx,dword ptr ds:[3AA2310h]  
00F12DAE  mov         ecx,dword ptr ds:[3AA2314h]  
00F12DB4  call        702AE044  
00F12DB9  ret

Upvotes: 4
