Reputation: 72675
I'm implementing a single-producer single-consumer queue, in which one thread waits for the global queue to be filled by another thread like this:
while (queue.head == queue.tail);
When I compiled the program with gcc -O0, it worked well. But when it was compiled with gcc -O1, it went into an infinite loop. I then looked at the assembly code and found that the -O1 version checked (queue.head == queue.tail) only once; if the two were equal, it jumped into an endless loop and never checked again.
I also tried declaring queue as volatile, but it didn't help. How can I make gcc aware that queue is shared among threads so that it stops optimizing like that? Many thanks.
P.S.
1 In a single-threaded program, it is OK to optimize like that. But in my program queue.tail can be modified by another thread.
2 My queue was declared like this:
typedef struct {
struct my_data data[MAX_QUEUE_LEN];
int head;
int tail;
} my_queue_t;
volatile my_queue_t queue;
3 I've also tried declaring only head and tail (but not the whole struct) as volatile, and it didn't work. But after I declared queue, head, and tail all as volatile, it worked. So should volatile be applied to all the related variables like this?
Upvotes: 1
Views: 1663
Reputation:
I compiled the following code:
struct my_data {
int x;
};
typedef struct {
struct my_data data[5];
int head;
int tail;
} my_queue_t;
volatile my_queue_t queue;
int main() {
while (queue.head == queue.tail);
}
with:
g++ -S -c -O1 th.cpp
which (for the while loop) produced the following output:
movl $_queue+20, %edx
movl $_queue+24, %eax
L2:
movl (%edx), %ebx
movl (%eax), %ecx
cmpl %ecx, %ebx
je L2
where head and tail are both loaded and compared inside the loop on every iteration. Could you post the assembly that your compiler emits?
Edit: Making head and tail volatile in the struct declaration, rather than declaring the struct instance volatile, resulted in identical code.
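As a side note, volatile only forces the loads to be re-issued; it does not make the accesses atomic or order them against the writes to data. If a C11 compiler is available, the indices could instead be declared with <stdatomic.h>. The following is only a minimal sketch under that assumption, not the asker's actual code: MAX_QUEUE_LEN = 8, the ring-buffer layout with one slot left empty, and the helper names queue_push and queue_pop are all made up for illustration.

```c
#include <assert.h>     /* static_assert */
#include <stdatomic.h>  /* atomic_int, atomic_load_explicit, ... */
#include <stdbool.h>

#define MAX_QUEUE_LEN 8 /* hypothetical capacity, not from the question */
static_assert(MAX_QUEUE_LEN > 1, "ring buffer needs at least two slots");

struct my_data {
    int x;
};

typedef struct {
    struct my_data data[MAX_QUEUE_LEN];
    atomic_int head; /* written only by the consumer */
    atomic_int tail; /* written only by the producer */
} my_queue_t;

/* Static storage is zero-initialized, so head == tail == 0 (empty). */
static my_queue_t queue;

/* Producer side (hypothetical helper): returns false when the queue is full. */
static bool queue_push(my_queue_t *q, struct my_data item)
{
    int tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    int next = (tail + 1) % MAX_QUEUE_LEN;

    /* Acquire pairs with the consumer's release store to head. */
    if (next == atomic_load_explicit(&q->head, memory_order_acquire))
        return false;

    q->data[tail] = item;
    /* Release publishes the slot contents before the new tail is visible. */
    atomic_store_explicit(&q->tail, next, memory_order_release);
    return true;
}

/* Consumer side (hypothetical helper): returns false when the queue is empty. */
static bool queue_pop(my_queue_t *q, struct my_data *out)
{
    int head = atomic_load_explicit(&q->head, memory_order_relaxed);

    /* Acquire pairs with the producer's release store to tail. */
    if (head == atomic_load_explicit(&q->tail, memory_order_acquire))
        return false;

    *out = q->data[head];
    atomic_store_explicit(&q->head, (head + 1) % MAX_QUEUE_LEN,
                          memory_order_release);
    return true;
}
```

With this, the waiting loop from the question would become something like while (!queue_pop(&queue, &item)); and the compiler is not allowed to hoist the atomic loads out of the loop. One slot is deliberately left unused so that head == tail can only mean "empty".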
Upvotes: 4