Reputation: 357
I am trying to solve the bounded buffer problem using a spin lock. The flag variable "lock" needs to be declared volatile when the program is compiled with -O, because otherwise the reader spins in "while(lock == 0)" forever. However, I found that "count" also needs to be declared volatile. Please see the code below.
#include <stdio.h>
#include <pthread.h>
#include <assert.h>
int count = 0;
volatile int lock = 0;
#define NUM_COUNT 5
static void *
writer(void *arg) {
int i = 0;
for(i=0; i < NUM_COUNT; i++) {
while(lock == 1) {
}
printf("Writer count %d ", ++count);
lock = 1;
}
return NULL;
}
static void *
reader(void *arg) {
static int k = 0;
while(count < NUM_COUNT) {
while(lock == 0) {
}
printf("reader count %d \n", count);
if (k > 0 && count!=++k) {
assert(0);
printf("reader count %d %d\n", count, k);
}
k = count;
lock = 0;
}
return NULL;
}
int
main(void)
{
pthread_t writer_pthread_id;
pthread_t reader_pthread_id1;
void *res;
pthread_create(&reader_pthread_id1, NULL, reader, NULL);
pthread_create(&writer_pthread_id, NULL, writer, NULL);
pthread_join(writer_pthread_id, &res);
printf("Joined with thread id %lu; return value was %p\n",
writer_pthread_id, (char *)res);
pthread_join(reader_pthread_id1, &res);
printf("Joined with thread id %lu; return value was %p\n",
reader_pthread_id1, (char *)res);
printf("count = %d\n", count);
return 0;
}
As the output shows, the reader does not seem to be reading the current value.
bash-3.2$ gcc -g -Wall -pthread -O thread_synchro2_bounded_buffer_spin_lock.c
bash-3.2$ ./a.out
Writer count 1 reader count 0
Writer count 2 reader count 1
Writer count 3 reader count 2
Writer count 4 reader count 3
Writer count 5 reader count 4
Joined with thread id 47779619322176; return value was (nil)
Joined with thread id 47779608832320; return value was (nil)
count = 5
However, if I change "count" to "volatile int count", the problem goes away.
bash-3.2$ gcc -g -Wall -pthread -O thread_synchro2_bounded_buffer_spin_lock.c
bash-3.2$ ./a.out
Writer count 1 reader count 1
Writer count 2 reader count 2
Writer count 3 reader count 3
Writer count 4 reader count 4
Writer count 5 reader count 5
Joined with thread id 47040774805824; return value was (nil)
Joined with thread id 47040764315968; return value was (nil)
count = 5
Usually, a variable accessed only inside the critical section does not need to be volatile, since it is not changed asynchronously. Could someone please help me understand which compiler optimization is causing this problem?
Upvotes: 2
Views: 140
Reputation: 18217
The probable reason for the discrepancy between volatile int count and plain int count is a compiler optimization in the reader thread. Since the reader has to evaluate while (count < NUM_COUNT), it already has count in a register, and later it does not bother to read it from memory again for printf("reader count %d \n", count);. When count is volatile, it has to read it again. Between these two statements, the writer thread updates count.
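To make the stale-register effect concrete, here is a small single-threaded model of it. This is only an illustration under assumptions: fake_writer, read_cached and read_reloaded are hypothetical names that do not appear in the question, and the "writer" is simulated by an ordinary function call so the result is deterministic.

```c
#include <stdio.h>

static int count = 0;

/* Stands in for the writer thread running between the two statements. */
static void fake_writer(void) { ++count; }

/* Models the non-volatile build: the value loaded for the loop test
 * stays in a "register" and is reused for the print. */
int read_cached(void) {
    int cached = count;      /* load done for while (count < NUM_COUNT) */
    fake_writer();           /* writer increments count in the meantime */
    return cached;           /* stale value is what gets printed */
}

/* Models the volatile build: count is loaded again before the print. */
int read_reloaded(void) {
    int for_test = count;    /* load done for the loop test */
    (void)for_test;
    fake_writer();           /* writer increments count in the meantime */
    return count;            /* fresh load sees the update */
}
```

Starting from count == 0, read_cached() returns 0 although count is already 1, which matches the "Writer count 1 reader count 0" line in the question, while read_reloaded() returns the value the writer just stored.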
But multi-threading is tricky and error-prone. It is best to use carefully designed and tested idiomatic methods (such as an atomics library), or otherwise to avoid it altogether and use parallelism in some other part of the calculation.
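As one possible shape for the atomics approach, here is a minimal sketch using C11 <stdatomic.h>. It is not a drop-in fix of the question's program: the seen array and the run_handoff helper are assumptions added for illustration, and the reader loops a fixed NUM_COUNT times instead of testing count in the loop condition, because that test would read count outside the handoff. The flag is an atomic_int with release stores and acquire loads, so count itself can stay a plain int.

```c
#include <pthread.h>
#include <stdatomic.h>

#define NUM_COUNT 5

static int count = 0;           /* plain int, guarded by the flag below */
static atomic_int lock = 0;     /* 0: writer's turn, 1: reader's turn */
static int seen[NUM_COUNT];     /* values observed by the reader */

static void *writer(void *arg) {
    (void)arg;
    for (int i = 0; i < NUM_COUNT; i++) {
        /* Acquire pairs with the reader's release store of 0. */
        while (atomic_load_explicit(&lock, memory_order_acquire) == 1) { }
        ++count;
        /* Release publishes the new count before handing over. */
        atomic_store_explicit(&lock, 1, memory_order_release);
    }
    return NULL;
}

static void *reader(void *arg) {
    (void)arg;
    for (int i = 0; i < NUM_COUNT; i++) {
        /* Acquire pairs with the writer's release store of 1. */
        while (atomic_load_explicit(&lock, memory_order_acquire) == 0) { }
        seen[i] = count;        /* ordered after the writer's increment */
        atomic_store_explicit(&lock, 0, memory_order_release);
    }
    return NULL;
}

/* Runs one writer/reader pair; returns the number of values the reader
 * saw that differ from the expected 1..NUM_COUNT, or -1 if a thread
 * could not be created. */
int run_handoff(void) {
    pthread_t w, r;
    int mismatches = 0;
    count = 0;
    atomic_store(&lock, 0);
    if (pthread_create(&r, NULL, reader, NULL) != 0) return -1;
    if (pthread_create(&w, NULL, writer, NULL) != 0) return -1;
    pthread_join(w, NULL);
    pthread_join(r, NULL);
    for (int i = 0; i < NUM_COUNT; i++)
        if (seen[i] != i + 1) mismatches++;
    return mismatches;
}
```

Calling run_handoff() from main and compiling with gcc -std=c11 -O -pthread should give a return value of 0, since each reader load of count is ordered after the matching writer increment. Busy-waiting is kept only to stay close to the question; a mutex with a condition variable would avoid burning CPU while waiting.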
--- UPDATED ---
Here is a diff of the assembly generated for the two versions of the reader thread. It confirms the hypothesis above. For those who want to parse it, the relevant change is the two lines added above .LVL10; the rest of the changes are mainly reloads of count:
--- reader.s.nonvolatile 2017-06-16 20:58:26.680644709 +0300
+++ reader.s.volatile 2017-06-16 20:58:29.143644664 +0300
@@ -6,8 +6,8 @@
.cfi_startproc
.LVL8:
.loc 1 29 0
- movl count(%rip), %edx
- cmpl $4, %edx
+ movl count(%rip), %eax
+ cmpl $4, %eax
jg .L14
.loc 1 26 0
subq $8, %rsp
@@ -18,6 +18,8 @@
movl lock(%rip), %eax
testl %eax, %eax
je .L10
+ .loc 1 33 0
+ movl count(%rip), %edx <--- Reads count again!
.LVL10:
.LBB14:
.LBB15:
@@ -36,7 +38,8 @@
.loc 1 34 0 is_stmt 0 discriminator 1
addl $1, %eax
movl %eax, k.3008(%rip)
- cmpl count(%rip), %eax
+ movl count(%rip), %edx
+ cmpl %edx, %eax
je .L11
.loc 1 35 0 is_stmt 1
movl $__PRETTY_FUNCTION__.3012, %ecx
@@ -52,7 +55,7 @@
.loc 1 39 0
movl $0, lock(%rip)
.loc 1 29 0
- movl %eax, %edx
+ movl count(%rip), %eax
cmpl $4, %eax
jle .L10
.loc 1 43 0
Upvotes: 1