Reputation: 751
I am trying to understand how the volatile keyword works in C++.
I had a look at "What kinds of optimizations does 'volatile' prevent in C++?". Looking at the accepted answer, it looks like volatile disables two kinds of optimizations.
I found similar information at The as-if rule:
Accesses (reads and writes) to volatile objects occur strictly according to the semantics of the expressions in which they occur. In particular, they are not reordered with respect to other volatile accesses on the same thread.
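To make the quoted rule concrete, here is a minimal sketch (the function names are hypothetical, not from the linked pages). Both functions read the same object twice in two separate statements. For the plain int the compiler is free to merge the two loads into one. For the volatile int each read is an access that has to be performed, in program order.

```cpp
#include <cassert>

// Plain int: the optimizer may load x once and reuse the value,
// because nothing in this function can change it between the reads.
int read_twice_plain(const int& x)
{
    int first = x;
    int second = x;
    return first + second;
}

// Volatile int: each read below is a separate access that must be
// performed, and the two reads may not be reordered with each other.
int read_twice_volatile(const volatile int& x)
{
    int first = x;   // first volatile read
    int second = x;  // second volatile read
    return first + second;
}

// When nothing else modifies the objects, both versions return the same
// value; the difference only shows up in the generated code.
void check_read_twice()
{
    int plain = 21;
    volatile int vol = 21;
    assert(read_twice_plain(plain) == 42);
    assert(read_twice_volatile(vol) == 42);
}
```

Whether a given compiler actually merges the plain loads depends on the compiler and the optimization level, so it is worth checking the assembly rather than assuming.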
I wrote a simple C++ program that sums all the values in an array to compare the behaviour of plain ints vs. volatile ints. Note that the partial sums are not volatile.
The array consists of unqualified ints:
int foo(const std::array<int, 4>& input)
{
    auto sum = 0xD;
    for (auto element : input)
    {
        sum += element;
    }
    return sum;
}
The array consists of volatile ints:
int bar(const std::array<volatile int, 4>& input)
{
    auto sum = 0xD;
    for (auto element : input)
    {
        sum += element;
    }
    return sum;
}
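As a sanity check, here is a small self-contained sketch that confirms the two versions compute the same result, so any difference is purely in the generated code. The names sum_plain, sum_volatile and check_sums are hypothetical. The function bodies are copies of the two functions above, and the input values are made up for illustration.

```cpp
#include <array>
#include <cassert>

// Same body as foo above: elements are plain int.
int sum_plain(const std::array<int, 4>& input)
{
    auto sum = 0xD;  // 13
    for (auto element : input)
    {
        sum += element;
    }
    return sum;
}

// Same body as bar above: elements are volatile int, sum is not.
int sum_volatile(const std::array<volatile int, 4>& input)
{
    auto sum = 0xD;  // 13
    for (auto element : input)  // each copy into element is a volatile read
    {
        sum += element;
    }
    return sum;
}

// Illustrative input; 13 + 1 + 2 + 3 + 4 == 23 for both versions.
void check_sums()
{
    std::array<int, 4> plain = {1, 2, 3, 4};
    std::array<volatile int, 4> vol = {1, 2, 3, 4};
    assert(sum_plain(plain) == 23);
    assert(sum_volatile(vol) == 23);
}
```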
When I look at the generated assembly code, SSE registers are used only in the case of plain ints. From what little I understand, the code using SSE registers is neither optimizing away the reads nor reordering them with respect to each other. The loop is unrolled, so there aren't any branches either. The only explanation I can think of for the difference in code generation is this: can the volatile reads be reordered before the accumulation happens? Clearly, sum is not volatile. If such reordering is bad, is there a situation/example that can illustrate the issue?
Code generated using Clang 9:
foo(std::array<int, 4ul> const&): # @foo(std::array<int, 4ul> const&)
    movdqu (%rdi), %xmm0
    pshufd $78, %xmm0, %xmm1 # xmm1 = xmm0[2,3,0,1]
    paddd %xmm0, %xmm1
    pshufd $229, %xmm1, %xmm0 # xmm0 = xmm1[1,1,2,3]
    paddd %xmm1, %xmm0
    movd %xmm0, %eax
    addl $13, %eax
    retq
bar(std::array<int volatile, 4ul> const&): # @bar(std::array<int volatile, 4ul> const&)
    movl (%rdi), %eax
    addl 4(%rdi), %eax
    addl 8(%rdi), %eax
    movl 12(%rdi), %ecx
    leal (%rcx,%rax), %eax
    addl $13, %eax
    retq
Upvotes: 4
Views: 11586
Reputation: 81327
The volatile keyword in C++ was inherited from C, where it was intended as a general catch-all to indicate places where a compiler should allow for the possibility that reading or writing an object might have side effects it doesn't know about. Because the kinds of side effects that could be induced vary among platforms, the Standard leaves the question of what allowances to make up to compiler writers' judgment as to how they can best serve their customers.
Microsoft's compilers for the 8088/8086 and later x86 have for decades been designed to support the practice of using volatile objects to build a mutex which guards "ordinary" objects. As a simple example: if thread 1 does something like:
ordinaryObject = 23;
volatileFlag = 1;
while (volatileFlag)
    doOtherStuffWhileWaiting();
useValue(ordinaryObject);
and thread 2 periodically does something like:
if (volatileFlag)
{
    ordinaryObject++;
    volatileFlag = 0;
}
then the accesses to volatileFlag would serve as a warning to Microsoft's compilers that they should refrain from making assumptions about how any preceding actions on any objects would interact with later actions. This pattern has been followed with the volatile qualifiers in other languages like C#.
Unfortunately, neither clang nor gcc includes any option to treat volatile in such a fashion. They instead require that programmers use compiler-specific intrinsics to obtain the semantics that Microsoft could achieve using only the Standard keyword volatile, which was intended to be suitable for such purposes. According to the authors of the Standard, "A volatile object is also an appropriate model for a variable shared among multiple processes." (see http://www.open-std.org/jtc1/sc22/wg14/www/C99RationaleV5.10.pdf p. 76 ll. 25-26).
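For comparison, here is a minimal sketch of the same flag handoff written with std::atomic from C++11. This is a swapped-in technique that the text above does not mention: std::atomic is the tool ISO C++ itself provides for this pattern, and it also works on clang and gcc. The names sharedFlag and run_handoff are hypothetical, and the second thread polls in a loop instead of checking "periodically".

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int ordinaryObject = 0;          // the "ordinary" object guarded by the flag
std::atomic<int> sharedFlag{0};  // stands in for volatileFlag

int run_handoff()
{
    ordinaryObject = 0;
    sharedFlag.store(0, std::memory_order_relaxed);

    // Thread 2: wait until the flag is raised, update the object, clear the flag.
    std::thread worker([] {
        while (sharedFlag.load(std::memory_order_acquire) == 0)
            std::this_thread::yield();
        ordinaryObject++;
        sharedFlag.store(0, std::memory_order_release);
    });

    // Thread 1: write the object, raise the flag, wait for it to be cleared.
    ordinaryObject = 23;
    sharedFlag.store(1, std::memory_order_release);
    while (sharedFlag.load(std::memory_order_acquire) != 0)
        std::this_thread::yield();  // doOtherStuffWhileWaiting() in the original
    int seen = ordinaryObject;      // the increment made by thread 2 is visible here

    worker.join();
    assert(sharedFlag.load() == 0); // the worker cleared the flag
    return seen;
}
```

The release store paired with the acquire load is what orders the accesses to ordinaryObject around the flag. A call to run_handoff() returns 24.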
Upvotes: 8