dmg

Reputation: 605

shared variable between two threads behaves differently from shared property

In his excellent treatise on threading in C#, Joseph Albahari proposed the following simple program to demonstrate why we need some form of memory fencing around data that is read and written by multiple threads. The program never ends if you compile it in Release mode and run it without a debugger attached:

  static void Main()
  {
     bool complete = false;
     var t = new Thread(() =>
     {
        bool toggle = false;
        while (!complete) toggle = !toggle;
     });
     t.Start();
     Thread.Sleep(1000);
     complete = true;                  
     t.Join(); // Blocks indefinitely
  }

My question is: why does the following slightly modified version of the above program no longer block indefinitely?

class Foo
{
  public bool Complete { get; set; }
}

class Program
{
  static void Main()
  {
     var foo = new Foo();
     var t = new Thread(() =>
     {
        bool toggle = false;
        while (!foo.Complete) toggle = !toggle;
     });
     t.Start();
     Thread.Sleep(1000);
     foo.Complete = true;                  
     t.Join(); // No longer blocks indefinitely!!!
  }
}

Whereas the following still blocks indefinitely:

class Foo
{
  public bool Complete;// { get; set; }
}

class Program
{
  static void Main()
  {
     var foo = new Foo();
     var t = new Thread(() =>
     {
        bool toggle = false;
        while (!foo.Complete) toggle = !toggle;
     });
     t.Start();
     Thread.Sleep(1000);
     foo.Complete = true;                  
     t.Join(); // Still blocks indefinitely!!!
  }
}

As does the following:

class Program
{
  static bool Complete { get; set; }

  static void Main()
  {
     var t = new Thread(() =>
     {
        bool toggle = false;
        while (!Complete) toggle = !toggle;
     });
     t.Start();
     Thread.Sleep(1000);
     Complete = true;                  
     t.Join(); // Still blocks indefinitely!!!
  }
}

Upvotes: 6

Views: 3242

Answers (4)

Orion Edwards

Reputation: 123642

To expand on Eric Petroelje's answer.

If we rewrite the program as follows (the behaviour is identical, but avoiding the lambda function makes it easier to read the disassembly), we can disassemble it and see what it actually means to "cache the value of a field in a register":

class Foo
{
    public bool Complete; // { get; set; }
}

class Program
{
    static Foo foo = new Foo();

    static void ThreadProc()
    {
        bool toggle = false;
        while (!foo.Complete) toggle = !toggle;

        Console.WriteLine("Thread done");
    }

    static void Main()
    {
        var t = new Thread(ThreadProc);
        t.Start();
        Thread.Sleep(1000);
        foo.Complete = true;
        t.Join();
    }
}

We get the following behaviour:

             |  Foo.Complete is a field  |  Foo.Complete is a property
x86-RELEASE  |  loops forever            |  completes
x64-RELEASE  |  completes                |  completes

In x86-release, the CLR JIT compiles the while(!foo.Complete) loop into this code:

Complete is a field:

004f0153 a1f01f2f03      mov     eax,dword ptr ds:[032F1FF0h] # Put a pointer to the Foo object in EAX
004f0158 0fb64004        movzx   eax,byte ptr [eax+4]  # Put the value pointed to by  [EAX+4] into EAX (this basically puts the value of .Complete into EAX)
004f015c 85c0            test    eax,eax   # Is EAX zero? (is .Complete false?)
004f015e 7504            jne     004f0164  # If it is not, exit the loop
# start of loop
004f0160 85c0            test    eax,eax   # Is EAX zero? (is .Complete false?)
004f0162 74fc            je      004f0160  # If it is, goto start of loop

The last 2 lines are the problem. If eax is zero, then it will just sit there in an infinite loop saying "is EAX zero?", without any code ever changing the value of eax!

Complete is a property:

00220155 a1f01f3a03      mov     eax,dword ptr ds:[033A1FF0h] # Put a pointer to the Foo object in EAX
0022015a 80780400        cmp     byte ptr [eax+4],0 # Compare the value at [EAX+4] with zero (is .Complete false?)
0022015e 74f5            je      00220155 # If it is, goto 2 lines up

This actually looks like nicer code. The JIT has inlined the property getter into a direct read of the Complete field (otherwise you'd see some call instructions going off to other functions). But because it does not cache the value in a register here, the loop re-reads the memory on every iteration, rather than just pointlessly re-testing the register.
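In C# terms, the difference between the two x86 loops can be sketched roughly as below. This is only an illustration, not JIT output: the names HoistingSketch, SpinOnCachedCopy, SpinOnFreshRead, setAfter and maxIterations are made up for the example, and the write inside the loop stands in for the other thread so that the sketch is deterministic and always terminates.

```csharp
using System;

class Foo
{
    public bool Complete;
}

static class HoistingSketch
{
    // Roughly what the x86 "field" loop amounts to: one read, then test the cached copy.
    public static int SpinOnCachedCopy(Foo foo, int setAfter, int maxIterations)
    {
        bool cached = foo.Complete; // single read, like the movzx before the loop
        int iterations = 0;
        while (!cached && iterations < maxIterations)
        {
            iterations++;
            // Stands in for the other thread's write (hypothetical, for the demo only).
            if (iterations == setAfter) foo.Complete = true;
        }
        return iterations;
    }

    // Roughly what the x86 "property" loop amounts to: re-read the field every iteration.
    public static int SpinOnFreshRead(Foo foo, int setAfter, int maxIterations)
    {
        int iterations = 0;
        while (!foo.Complete && iterations < maxIterations)
        {
            iterations++;
            if (iterations == setAfter) foo.Complete = true;
        }
        return iterations;
    }

    static void Main()
    {
        Console.WriteLine(SpinOnCachedCopy(new Foo(), 3, 1000)); // prints 1000: the write is never noticed
        Console.WriteLine(SpinOnFreshRead(new Foo(), 3, 1000));  // prints 3: the write is seen on the next test
    }
}
```

The cached version only stops because of the maxIterations guard; without it, it would spin forever, just like the je/test pair above.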

In x64-release, the 64-bit CLR JIT compiles the while(!foo.Complete) loop into this code:

Complete is a field:

00140245 48b8d82f961200000000 mov rax,12962FD8h # put 12962FD8h into RAX. 12962FD8h is a pointer-to-a-pointer in some .NET static object table
0014024f 488b00          mov     rax,qword ptr [rax] # Follow the above pointer; puts a pointer to the Foo object in RAX
00140252 0fb64808        movzx   ecx,byte ptr [rax+8] # Add 8 to the pointer to Foo object (it now points to the .Complete field) and put that value in ECX
00140256 85c9            test    ecx,ecx # Is ECX zero ? (is the .Complete field false?)
00140258 751b            jne     00140275 # If nonzero/true, exit the loop
0014025a 660f1f440000    nop     word ptr [rax+rax]  # Do nothing!
# start of loop
00140260 48b8d82f961200000000 mov rax,12962FD8h # put 12962FD8h into RAX. 12962FD8h is a pointer-to-a-pointer in some .NET static object table
0014026a 488b00          mov     rax,qword ptr [rax] # Follow the above pointer; puts a pointer to the Foo object in RAX 
0014026d 0fb64808        movzx   ecx,byte ptr [rax+8] # Add 8 to the pointer to Foo object (it now points to the .Complete field) and put that value in ECX
00140271 85c9            test    ecx,ecx # Is ECX zero? (is the .Complete field false?)
00140273 74eb            je      00140260 # If zero/false, go to start of loop

Complete is a property:

00140250 48b8d82fe11200000000 mov rax,12E12FD8h # put 12E12FD8h into RAX. 12E12FD8h is a pointer-to-a-pointer in some .NET static object table
0014025a 488b00          mov     rax,qword ptr [rax] # Follow the above pointer; puts a pointer to the Foo object in RAX
0014025d 0fb64008        movzx   eax,byte ptr [rax+8] # Add 8 to the pointer to Foo object (it now points to the .Complete field) and put that value in EAX
00140261 85c0            test    eax,eax # Is EAX 0 ? (is the .Complete field false?)
00140263 74eb            je      00140250 # If zero/false, go to the start

The 64-bit JIT is doing the same thing for both properties and fields, except when it's a field it's "unrolled" the first iteration of the loop - this basically puts an if(foo.Complete) { jump past the loop code } in front of it for some reason.

In both cases, it's doing a similar thing to the x86 JIT when dealing with a property:
- It inlines the method to a direct memory read
- It doesn't cache it, and re-reads the value each time

I'm not sure if the 64 bit CLR is not allowed to cache the field value in the register like the 32 bit one does, but if it is, it's not bothering to do so. Perhaps it will in future?

At any rate, this illustrates how the behaviour is platform dependent and subject to change. I hope it helps :-)
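Since the outcome depends on which JIT you happen to run on, one way to stop relying on it is to make the read explicit. The following is a minimal sketch, assuming .NET 4.5 or later (where System.Threading.Volatile is available); the class name VolatileReadSketch, the Run helper and the timeouts are made up for illustration.

```csharp
using System;
using System.Threading;

class Foo
{
    public bool Complete; // plain field, as in the version that hangs on x86
}

static class VolatileReadSketch
{
    // Returns true if the worker noticed the write and exited before the timeout.
    public static bool Run()
    {
        var foo = new Foo();
        var t = new Thread(() =>
        {
            bool toggle = false;
            // Volatile.Read asks for a fresh load of the field on every iteration,
            // so the value cannot be kept in a register across the loop.
            while (!Volatile.Read(ref foo.Complete)) toggle = !toggle;
        });
        t.IsBackground = true; // a stuck worker cannot keep the process alive
        t.Start();
        Thread.Sleep(100);
        Volatile.Write(ref foo.Complete, true);
        return t.Join(TimeSpan.FromSeconds(5));
    }

    static void Main()
    {
        Console.WriteLine(Run() ? "Thread done" : "Timed out");
    }
}
```

With the explicit read in place, the loop should no longer depend on whether the JIT chooses to hoist the field read.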

Upvotes: 0

Eric Petroelje

Reputation: 60498

In the first example, Complete is a plain variable and could be cached in a register by each thread. Since you aren't using locking, updates to that variable may not be flushed to main memory, and the other thread will see a stale value for that variable.

In the second example, where Complete is a property, you are actually calling a function on the Foo object to return a value. My guess would be that while simple variables may be cached in registers, the compiler may not always optimize actual properties that way.

EDIT:

Regarding the optimization of automatic properties - I don't think there is anything guaranteed by the specification in that regard. You are essentially banking on whether or not the compiler/runtime will be able to optimize out the getter/setter or not.

In the case where it is on the same object, it seems like it does. In the other case, it seems like it does not. Either way, I wouldn't bet on it. The easiest way to solve this would be to use a simple member variable and mark it as volatile to ensure that it is always synced with main memory.
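A minimal sketch of that suggestion, reusing the Foo class from the question. The VolatileFieldSketch class, the Run helper and the timeouts are hypothetical, added only so the example reports a result instead of hanging if something goes wrong.

```csharp
using System;
using System.Threading;

class Foo
{
    // volatile: every read in the loop below has to go back to the field.
    public volatile bool Complete;
}

static class VolatileFieldSketch
{
    // Returns true if the worker noticed the write and exited before the timeout.
    public static bool Run()
    {
        var foo = new Foo();
        var t = new Thread(() =>
        {
            bool toggle = false;
            while (!foo.Complete) toggle = !toggle;
        });
        t.IsBackground = true; // a stuck worker cannot keep the process alive
        t.Start();
        Thread.Sleep(100);
        foo.Complete = true;
        return t.Join(TimeSpan.FromSeconds(5));
    }

    static void Main()
    {
        Console.WriteLine(Run() ? "Thread done" : "Timed out");
    }
}
```

Note that volatile can only be applied to fields, not to locals or auto-implemented properties, which is why the flag lives on Foo here.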

Upvotes: 7

Chris Shain

Reputation: 51329

The other answers explain what happens in technically correct terms. Let me see if I can explain it in plain English.

The first example says "Loop until this variable location is true." The new thread creates a copy of that variable location (because it is a value type) and proceeds to loop forever. If the variable had happened to be a reference type, it would have made a copy of the reference, but since the reference happened to point to the same memory location it would have worked.

The second example says "Loop until this method (the getter) returns true." The new thread cannot create a copy of a method, so it creates a copy of the reference to the instance of the class in question, and repeatedly calls the getter on that instance until it returns true (repeatedly reading the same variable location that is set to true in the main thread).

The third example is the same as the first. The fact that the closed variable happens to be a member of another class instance is not relevant.

Upvotes: 3

Tejs

Reputation: 41236

This is because in the first snippet you provided, you made a lambda expression that closed over the boolean value complete - so, when the compiler rewrites that, it captures a copy of the value, not a reference. Likewise, in the second one, it's capturing a reference instead of a copy, due to closing over the Foo object, and thus when you change the underlying value, the change is noticed because of the reference.

Upvotes: 5
