Reputation: 7667
I was playing with one of the examples in C++ Concurrency in Action which uses std::memory_order_relaxed
for reading and writing 3 atomic variables from 5 different threads. The example program is as follows:
#include <thread>
#include <atomic>
#include <iostream>

std::atomic<int> x(0);
std::atomic<int> y(0);
std::atomic<int> z(0);
std::atomic<bool> go(false);
const unsigned int loop_count = 10;

struct read_values
{
    int x;
    int y;
    int z;
};

read_values values1[loop_count];
read_values values2[loop_count];
read_values values3[loop_count];
read_values values4[loop_count];
read_values values5[loop_count];

void increment( std::atomic<int>* v, read_values* values )
{
    while (!go)
        std::this_thread::yield();
    for (unsigned i=0;i<loop_count;++i)
    {
        values[i].x=x.load( std::memory_order_relaxed );
        values[i].y=y.load( std::memory_order_relaxed );
        values[i].z=z.load( std::memory_order_relaxed );
        v->store( i+1, std::memory_order_relaxed );
        std::this_thread::yield();
    }
}

void read_vals( read_values* values )
{
    while (!go)
        std::this_thread::yield();
    for (unsigned i=0;i<loop_count;++i)
    {
        values[i].x=x.load( std::memory_order_relaxed );
        values[i].y=y.load( std::memory_order_relaxed );
        values[i].z=z.load( std::memory_order_relaxed );
        std::this_thread::yield();
    }
}

void print( read_values* values )
{
    for (unsigned i=0;i<loop_count;++i)
    {
        if (i)
            std::cout << ",";
        std::cout << "(" << values[i].x <<","
                  << values[i].y <<","
                  << values[i].z <<")";
    }
    std::cout << std::endl;
}

int main()
{
    std::thread t1( increment, &x, values1);
    std::thread t2( increment, &y, values2);
    std::thread t3( increment, &z, values3);
    std::thread t4( read_vals, values4);
    std::thread t5( read_vals, values5);
    go = true;
    t5.join();
    t4.join();
    t3.join();
    t2.join();
    t1.join();
    print( values1 );
    print( values2 );
    print( values3 );
    print( values4 );
    print( values5 );
    return 0;
}
Every time I run the program I get exactly the same output:
(0,10,10),(1,10,10),(2,10,10),(3,10,10),(4,10,10),(5,10,10),(6,10,10),(7,10,10),(8,10,10),(9,10,10)
(0,0,1),(0,1,2),(0,2,3),(0,3,4),(0,4,5),(0,5,6),(0,6,7),(0,7,8),(0,8,9),(0,9,10)
(0,0,0),(0,1,1),(0,2,2),(0,3,3),(0,4,4),(0,5,5),(0,6,6),(0,7,7),(0,8,8),(0,9,9)
(0,0,0),(0,0,0),(0,0,0),(0,0,0),(0,0,0),(0,0,0),(0,0,0),(0,0,0),(0,0,0),(0,0,0)
(0,0,0),(0,0,0),(0,0,0),(0,0,0),(0,0,0),(0,0,0),(0,0,0),(0,0,0),(0,0,0),(0,0,0)
If I change from std::memory_order_relaxed to std::memory_order_seq_cst, the program gives exactly the same output!
I would have expected different output from the two versions of the program. Why is there no difference between the output for std::memory_order_relaxed and std::memory_order_seq_cst?
Why does std::memory_order_relaxed always produce exactly the same results for every run of the program?
I am using:
- 32-bit Ubuntu installed as a virtual machine (under VMware)
- An Intel quad-core processor
- GCC 4.6.1-9
The code is compiled with: g++ --std=c++0x -g mem-order-relaxed.cpp -o relaxed -pthread
Note that -pthread is necessary; otherwise the following error is reported:
terminate called after throwing an instance of 'std::system_error'
what(): Operation not permitted
Is the behaviour I am seeing due to a lack of support in GCC, or a result of running under VMware?
Upvotes: 3
Views: 1721
Reputation: 300
Many versions of GCC ignore the memory ordering that you provide and use sequential consistency instead. You can see this in the header files. Hopefully they will eventually have a better implementation. You can explore the effects of relaxed vs. seq_cst by using CDSChecker...
Upvotes: 0
Reputation: 10863
Your use of yield makes your program's behaviour depend more on your platform's scheduler than on anything else.
That being said, memory_order_relaxed does not require the compiler to reorder the atomic operations; it merely allows it to. If the compiler is happy with the ordering it gets with memory_order_seq_cst, it may in fact generate exactly the same machine code. This is especially true on x86, because the instruction set already offers so many ordering guarantees that memory_order_seq_cst is not much of a leap.
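To make "allows, but does not require" concrete, here is a minimal sketch of the usual message-passing test. It is not taken from the book or from the question; message_pass is a hypothetical helper name. With release/acquire on the flag, the reader is guaranteed to see 42. With relaxed on both sides, the standard permits the reader to see either 0 or 42, yet on x86 you will most likely always get 42, which is the same "no visible difference" you observed.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper (an illustration, not from the original example):
// a writer publishes `data` and then raises `ready`; a reader spins on
// `ready` and returns whatever it then observes in `data`.
int message_pass(std::memory_order store_order, std::memory_order load_order)
{
    std::atomic<int>  data(0);
    std::atomic<bool> ready(false);
    int seen = -1;

    std::thread writer([&] {
        data.store(42, std::memory_order_relaxed);
        ready.store(true, store_order);      // pass release or relaxed
    });
    std::thread reader([&] {
        while (!ready.load(load_order))      // pass acquire or relaxed
            std::this_thread::yield();
        seen = data.load(std::memory_order_relaxed);
    });

    writer.join();
    reader.join();
    return seen;   // safe to read: join() synchronizes with the reader
}
```

Calling message_pass(std::memory_order_release, std::memory_order_acquire) must return 42 on any conforming implementation. Calling it with std::memory_order_relaxed for both arguments may return 0 or 42 as far as the standard is concerned; seeing 42 every time does not mean the weaker ordering was ignored, only that neither the compiler nor the hardware happened to use the extra freedom.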
Upvotes: 2
Reputation: 340168
How many processor cores do you have assigned to the VM? Assign multiple cores to the VM to let it take advantage of concurrency.
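A quick way to check this from inside the VM is to ask the standard library how many hardware threads it can see. This is only a rough sketch: describe_cores and report_cores are hypothetical helper names, and std::thread::hardware_concurrency() is a hint that may return 0 when the value cannot be determined.

```cpp
#include <iostream>
#include <string>
#include <thread>

// Hypothetical helper: turns a core count into a one-line diagnosis.
// A count of 0 means the implementation could not determine the value.
std::string describe_cores(unsigned n)
{
    if (n == 0)
        return "core count unknown";
    if (n == 1)
        return "1 core: threads can only interleave, never run in parallel";
    return std::to_string(n) + " cores: threads can run in parallel";
}

// Prints the diagnosis for the machine (or VM) the program is running on.
void report_cores()
{
    std::cout << describe_cores(std::thread::hardware_concurrency())
              << std::endl;
}
```

If report_cores() reports a single core inside the VM, the five threads in the question can only take turns, which would be consistent with each thread finishing its whole loop before the next one starts.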
Upvotes: 6