NeilB

Reputation: 357

How does volatile integer fix this thread synchronization issue?

I am trying to solve the bounded buffer problem using a spin lock. The flag variable "lock" needs to be declared volatile when the program is compiled with -O, because without it the reader spins in "while(lock == 0)" forever. However, I found that "count" also needs to be declared volatile. Please see the code below.

#include <stdio.h>
#include <pthread.h>
#include <assert.h>

int count = 0;
volatile int lock = 0;

#define NUM_COUNT       5

static void *
writer(void *arg) {
    int i = 0;

    for(i=0; i < NUM_COUNT; i++) {
        while(lock == 1) {
        }

        printf("Writer count %d ", ++count);
        lock = 1;
    }

    return NULL;
}

static void *
reader(void *arg) {
    static int k = 0;

    while(count < NUM_COUNT) {
        while(lock == 0) {
        }

        printf("reader count %d \n", count);
        if (k > 0 && count!=++k) {
           assert(0);
           printf("reader count %d %d\n", count, k);
        }
        k = count;
        lock = 0;
    }

    return NULL;
}

int
main(void)
{
    pthread_t writer_pthread_id;
    pthread_t reader_pthread_id1;
    void *res;

    pthread_create(&reader_pthread_id1, NULL, reader, NULL);
    pthread_create(&writer_pthread_id, NULL, writer, NULL);


    pthread_join(writer_pthread_id, &res);
    printf("Joined with thread id %lu; return value was %p\n",
           writer_pthread_id, (char *)res);

    pthread_join(reader_pthread_id1, &res);
    printf("Joined with thread id %lu; return value was %p\n",
           reader_pthread_id1, (char *)res);


    printf("count = %d\n", count);

    return 0;
}

According to the output, the reader does not seem to be reading the current value.

bash-3.2$ gcc -g -Wall -pthread -O thread_synchro2_bounded_buffer_spin_lock.c
bash-3.2$ ./a.out
Writer count 1 reader count 0 
Writer count 2 reader count 1 
Writer count 3 reader count 2 
Writer count 4 reader count 3 
Writer count 5 reader count 4 
Joined with thread id 47779619322176; return value was (nil)
Joined with thread id 47779608832320; return value was (nil)
count = 5

However, if I change "count" to "volatile int count", the problem goes away.

bash-3.2$ gcc -g -Wall -pthread -O thread_synchro2_bounded_buffer_spin_lock.c
bash-3.2$ ./a.out
Writer count 1 reader count 1
Writer count 2 reader count 2
Writer count 3 reader count 3
Writer count 4 reader count 4
Writer count 5 reader count 5
Joined with thread id 47040774805824; return value was (nil)
Joined with thread id 47040764315968; return value was (nil)
count = 5

Usually, a variable accessed inside the critical section does not need to be volatile, since it is not changed asynchronously there. Could someone please help me understand which compiler optimization is causing this problem?

Upvotes: 2

Views: 140

Answers (1)

Dan Getz

Reputation: 18217

The probable reason for the difference between volatile int count and plain int count is a compiler optimization in the reader thread.

Since the compiler needs to evaluate while (count < NUM_COUNT) in the reader thread, it already has count in a register, and later it doesn't bother to read it from memory again for printf("reader count %d \n", count);. When count is volatile, it has to read it again. Between these two statements, the writer thread updates count, so the non-volatile version prints the stale value.
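To make the effect concrete, here is a single-threaded model of one pass through the reader loop. It is only a sketch of what the optimizer is allowed to do, not the actual generated code: simulated_writer, reader_pass_cached and reader_pass_reloaded are hypothetical names, and the writer thread is replaced by a plain function call at the point where the reader would be spinning on lock.

```c
#include <stdio.h>

#define NUM_COUNT 5

static int count = 0;

/* Hypothetical stand-in for the writer thread, which runs while the
   reader is spinning on lock. */
static void simulated_writer(void) {
    ++count;
}

/* Model of the non-volatile reader: the load done for the loop test
   is kept in a register and reused as the printf argument. */
static int reader_pass_cached(void) {
    int cached = count;              /* load for: while (count < NUM_COUNT) */
    if (cached >= NUM_COUNT)
        return -1;                   /* loop would exit */
    simulated_writer();              /* writer updates count during the spin */
    printf("reader count %d \n", cached);
    return cached;                   /* stale value */
}

/* Model of the volatile reader: every source-level read of count
   becomes a fresh load from memory. */
static int reader_pass_reloaded(void) {
    if (count >= NUM_COUNT)
        return -1;
    simulated_writer();
    printf("reader count %d \n", count);
    return count;                    /* current value */
}
```

Starting from count == 0, reader_pass_cached() reports 0 even though count is already 1, which matches the "Writer count 1 reader count 0" line in the question, while reader_pass_reloaded() reports 1.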

But multi-threading is tricky and error-prone. It is best to use carefully designed and tested idiomatic methods (such as an atomics library), or otherwise to avoid it altogether and use parallelism in some other part of the calculation.
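As a rough illustration of the atomics route, here is a sketch of the same writer/reader handoff using C11 <stdatomic.h> instead of volatile. It assumes a C11 compiler and linking with -pthread. The seen array and run_handoff are hypothetical helpers added for checking, and the reader loops a fixed NUM_COUNT times rather than testing count outside the handoff, which is a deliberate change from the code in the question.

```c
#include <pthread.h>
#include <stdatomic.h>

#define NUM_COUNT 5

static int count = 0;          /* plain int, only touched by the flag's owner */
static atomic_int lock = 0;    /* 0: writer's turn, 1: reader's turn */
static int seen[NUM_COUNT];    /* hypothetical: values the reader observed */

static void *writer(void *arg) {
    (void)arg;
    for (int i = 0; i < NUM_COUNT; i++) {
        /* Acquire pairs with the reader's release store of 0. */
        while (atomic_load_explicit(&lock, memory_order_acquire) == 1) {
        }
        ++count;
        /* Release publishes the new count before handing over the turn. */
        atomic_store_explicit(&lock, 1, memory_order_release);
    }
    return NULL;
}

static void *reader(void *arg) {
    (void)arg;
    for (int i = 0; i < NUM_COUNT; i++) {
        /* Acquire pairs with the writer's release store of 1. */
        while (atomic_load_explicit(&lock, memory_order_acquire) == 0) {
        }
        seen[i] = count;
        atomic_store_explicit(&lock, 0, memory_order_release);
    }
    return NULL;
}

/* Hypothetical driver: returns 1 if the reader saw 1..NUM_COUNT in order. */
static int run_handoff(void) {
    pthread_t w, r;

    count = 0;
    atomic_store(&lock, 0);
    if (pthread_create(&r, NULL, reader, NULL) != 0)
        return 0;
    if (pthread_create(&w, NULL, writer, NULL) != 0)
        return 0;
    pthread_join(w, NULL);
    pthread_join(r, NULL);

    for (int i = 0; i < NUM_COUNT; i++) {
        if (seen[i] != i + 1)
            return 0;
    }
    return 1;
}
```

Calling run_handoff() from main should return 1. Here count can stay a plain int because the acquire/release pairs on lock order every access to it, which is a guarantee volatile does not give.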

--- UPDATED ---

Here is a diff of the assembly generated for the two versions of the reader thread. It confirms the hypothesis above. For those who want to parse it, the relevant change is the two lines added above .LVL10; the rest of the changes are mainly reloads of count:

--- reader.s.nonvolatile    2017-06-16 20:58:26.680644709 +0300
+++ reader.s.volatile   2017-06-16 20:58:29.143644664 +0300
@@ -6,8 +6,8 @@
    .cfi_startproc
 .LVL8:
    .loc 1 29 0
-   movl    count(%rip), %edx
-   cmpl    $4, %edx
+   movl    count(%rip), %eax
+   cmpl    $4, %eax
    jg  .L14
    .loc 1 26 0
    subq    $8, %rsp
@@ -18,6 +18,8 @@
    movl    lock(%rip), %eax
    testl   %eax, %eax
    je  .L10
+   .loc 1 33 0
+   movl    count(%rip), %edx        <--- Reads count again!
 .LVL10:
 .LBB14:
 .LBB15:
@@ -36,7 +38,8 @@
    .loc 1 34 0 is_stmt 0 discriminator 1
    addl    $1, %eax
    movl    %eax, k.3008(%rip)
-   cmpl    count(%rip), %eax
+   movl    count(%rip), %edx
+   cmpl    %edx, %eax
    je  .L11
    .loc 1 35 0 is_stmt 1
    movl    $__PRETTY_FUNCTION__.3012, %ecx
@@ -52,7 +55,7 @@
    .loc 1 39 0
    movl    $0, lock(%rip)
    .loc 1 29 0
-   movl    %eax, %edx
+   movl    count(%rip), %eax
    cmpl    $4, %eax
    jle .L10
    .loc 1 43 0

Upvotes: 1
