Dandan
Dandan

Reputation: 529

A legitimate use case for volatile in C++?

I'm running a few threads that basically are all returning the same object as a result. Then I wait for all of them to complete, and basically read the results. To avoid needing synchronization, I figured I could just pre-allocate all the result objects in an array or vector and give the threads a pointer to each. At a high level, the code is something like this (simplified):

std::vector<Foo> results(2);
RunThread1(&results[0]);
RunThread2(&results[1]);
WaitForAll();
// Read results
cout << results[0].name << results[1].name;

Basically I'd like to know if there's anything potentially unsafe about this "code". One thing I was wondering is whether the vector should declared volatile such that the reads at the end aren't optimized and output an incorrect value.

Upvotes: 1

Views: 139

Answers (3)

David Schwartz
David Schwartz

Reputation: 182743

The short answer to your question is no, the array should not be declared volatile. For two simple reasons:

  1. Using volatile is not necessary. Every sane multithreading platform provides synchronization primitives with well-defined semantics. If you use them, you don't need volatile.

  2. Using volatile isn't sufficient. Since volatile doesn't have defined multithread semantics on any platform you are likely to use, it alone is not enough to provide synchronization.

Most likely, whatever you do in WaitForAll will be sufficient. For example, if it uses an event, mutex, condition variable, or almost anything like that, it will have defined multithreading semantics that are sufficient to make this safe.

Update: "Just out curiosity, what would be an example of something that happens in the WaitForAll that guarantees safety of the read? Wouldn't it need to effectively tell the compiler somehow to "flush" the cache or avoid optimizations of subsequent read operations?"

Well, if you're using pthreads, then if it uses pthread_join that would be an example that guarantees safety of the read because the documentation says that anything the thread does is visible to a thread that joins it after pthread_join returns.

How it does it is an implementation detail. In practice, on modern systems, there are no caches to flush nor are there any optimizations of subsequent reads that are possible but need to be avoided.

Consider if somewhere deep inside WaitForAll, there's a call to pthread_join. Generally, you simply don't let the compiler see into the internals of pthread_join and thus the compiler has to assume that pthread_join might do anything that another thread could do. So keeping information that another thread might modify in a register across a call to pthread_join would be illegal because pthread_join itself might access or modify that data.

Upvotes: 3

Jeffrey
Jeffrey

Reputation: 11400

It depends on what's in WaitForAll(). If it's a proper synchronization, all is good. Mutexes, for example, or thread join, will result in the proper memory synchronization being put in.

Volatile would not help. It may prevent compiler optimizations, but would not affect anything happening at the CPU level, like caches not being updated. Use proper synchronization, like mutexes, thread join, and then the result will be valid (sequentially coherent). Don't count on silver bullet volatile. Compilers and CPUs are now complex enough that it won't be guaranteed.

Other answers will elaborate on the memory fences and other instructions that the synchronization will put in. :-)

Upvotes: 0

eerorika
eerorika

Reputation: 238291

I was wondering is whether the vector should declared volatile such that the reads at the end aren't optimized and output an incorrect value.

No. If there was problem with lack of synchronisation, then volatile would not help.

But there is no problem with lack of synchronisation, since according to your description, you don't access the same object from multiple threads - until you've waited for the threads to complete, which is something that synchronises the threads.

There is a potential problem that if the objects are small (less than about 64 bytes; depending on CPU architecture), then the objects in the array share a cache "line", and access to them may become effectively synchronised due to write contention. This is only a problem if the threads write to the variable a lot in relation to operations that don't access the output object.

Upvotes: 1

Related Questions