Reputation: 712
As a hobby project I'm toying around with creating a programming language with garbage collection. The language will be compiled to (preferably portable) C++ and supports threads.
The question is: Suppose two threads "simultaneously" write different values to the same (pointer-sized and aligned) memory location. Is it then possible for any thread to read a mix of the two values?
For example on a 32 bit platform:
Thread 1 writes: AAAAAAAA
Thread 2 writes: BBBBBBBB
Will any thread always read AAAAAAAA or BBBBBBBB or could they read AAAABBBB or some other "mix" between the two? I don't care about the ordering and what ends up being the final value. The important thing is just that no invalid value can ever be read from the location.
I realize that this may depend on the platform and C++ may not provide any promises for it. Would it be guaranteed for some platforms and would that involve using inline assembler to achieve it?
PS: I believe std::atomic would make such guarantees, but I think it would be far too much overhead to use for all load/store operations on object references.
Upvotes: 0
Views: 67
Reputation: 2096
C++ makes no such guarantee for plain (non-atomic) objects; formally, unsynchronized concurrent writes are a data race and the behavior is undefined. In practice it depends on the hardware. On typical processors, such as ARM, x86 and amd64, 32-bit reads and writes are atomic as long as they are 32-bit aligned.
If you read or write the 32 bits a byte at a time (such as with strcpy, memcpy, etc.), all bets are off. It depends very much on the implementation of those functions (they tend to get lots of optimizations).
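A portable way to get the "no mixed value" guarantee is std::atomic with relaxed ordering. The sketch below is only an illustration: the names slot and count_torn_reads are made up for this example, and the 0xAAAAAAAA / 0xBBBBBBBB patterns are taken from the question. Relaxed atomics guarantee that each load sees one whole stored value, without asking for any ordering between different locations. On mainstream x86 and ARM compilers a relaxed load or store of a pointer-sized atomic typically compiles to an ordinary load or store instruction, so the overhead may be much smaller than feared, but check the generated code for your target.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical stand-in for one object-reference slot in the runtime.
std::atomic<std::uintptr_t> slot{0xAAAAAAAAu};

// Two writer threads store different patterns while the calling thread reads.
// Returns how many reads saw anything other than one of the two patterns.
int count_torn_reads(int iterations) {
    assert(iterations > 0);
    const std::uintptr_t a = 0xAAAAAAAAu;
    const std::uintptr_t b = 0xBBBBBBBBu;
    slot.store(a, std::memory_order_relaxed);

    auto writer = [iterations](std::uintptr_t value) {
        for (int i = 0; i < iterations; ++i)
            slot.store(value, std::memory_order_relaxed);
    };
    std::thread t1(writer, a);
    std::thread t2(writer, b);

    int torn = 0;
    for (int i = 0; i < iterations; ++i) {
        // A relaxed load is still indivisible: it returns a whole stored value.
        std::uintptr_t seen = slot.load(std::memory_order_relaxed);
        if (seen != a && seen != b) ++torn;
    }
    t1.join();
    t2.join();
    return torn;
}
```

Calling count_torn_reads(100000) should return 0 on any conforming implementation, because the standard requires atomic loads and stores to be indivisible regardless of the memory order chosen.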
It gets more complicated on some platforms when there are multiple memory locations.
Say you have
#include <cstdint>
extern std::uint32_t a;
extern std::uint32_t b;
// ... later, in thread 1:
a = 0x12345678;
b = 0x87654321;
Now, individually, a and b are written to atomically by thread 1, but an observer, thread 2, may "see" the value of b change before a.
This can happen because of both hardware and software. The software (the C++ compiler / optimizer) may rearrange your code if it thinks it would be better. (Or the compiler may even avoid writing the values to a and b at all in some cases.)
The hardware can also rearrange memory reads/writes at runtime. This is visible when thread 1 and thread 2 are running on different cores: until core 1 does something to synchronize its internal memory pipeline with the rest of the system, core 2 may see something different. IA-64 is pretty aggressive about these sorts of optimizations. x86 is not so much (as it would break too much legacy code, I presume).
In C/C++, "volatile" basically lets you tell the compiler to be less aggressive with optimizations around this variable, though exactly what it does depends on the implementation. Usually what it means is that the compiler won't optimize away reads/writes to volatile variables and won't reorder volatile accesses relative to each other. It may still move ordinary (non-volatile) accesses around them, and volatile alone does not make an access atomic.
This doesn't change what the processor might do at runtime, though. For that, you need to use the special "memory barrier" intrinsics / operations. The details of these are complex, and are usually hidden behind such things as "atomic".
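As a rough sketch of how std::atomic hides those barriers, here is the a / b example again with b made atomic. The function name publish_and_observe is invented for this illustration, and unsigned 32-bit types are used so that 0x87654321 fits. A release store to b paired with an acquire load of b ensures that a thread which sees the new b also sees the earlier write to a, whatever the compiler or CPU would otherwise reorder.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>

std::uint32_t a = 0;               // plain data, published through b
std::atomic<std::uint32_t> b{0};   // acts as the "ready" flag

// Thread 1 (the caller) writes a, then b. The reader thread waits for the
// new b and then reads a. Returns the value of a that the reader observed.
std::uint32_t publish_and_observe() {
    a = 0;
    b.store(0, std::memory_order_relaxed);

    std::uint32_t seen_a = 0;
    std::thread reader([&seen_a] {
        // Acquire: once the new b is visible, so are all writes before it.
        while (b.load(std::memory_order_acquire) != 0x87654321u) {
            // spin until the writer publishes
        }
        seen_a = a;
    });

    a = 0x12345678u;
    // Release: the write to a cannot become visible after this store.
    b.store(0x87654321u, std::memory_order_release);

    reader.join();
    return seen_a;
}
```

With these orderings the reader always gets 0x12345678. If both operations on b were relaxed instead, the reader could legally still see the old a, which is the reordering described above.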
Oh, also, most systems have magical memory -- certain addresses which are reserved by the hardware for special purposes. Typically unless you are writing device drivers you won't run into this.
Upvotes: 1