Reputation: 11
Threads A and B are executing concurrently. Which ARMv8-A memory barrier types (like DMB, DSB) are sufficient to fulfill the postcondition, and why?
Initially x1 = 0, x2 = 0
Thread A | Thread B
----------------------------------
x1 = 1 | x2 = 1
barrier | barrier
y1 = x2 | y2 = x1
Postcondition: (y1 == 1) || (y2 == 1)
I looked at the ARMv8-A Architecture Reference Manual memory model definition of DMB and DSB, but could not deduce an argument why the postcondition would hold even with the DSB memory barrier. I think the key definitions in the Architecture Reference Manual are:
The DMB instruction ensures that all affected memory accesses by the PE executing the DMB that appear in program order before the DMB and those which originate from a different PE [...] which have been Observed-by the PE before the DMB is executed, are Observed-by each PE [...] before any affected memory accesses that appear in program order after the DMB are Observed-by that PE.
and
A DSB executed by a PE [...] completes when all of the following apply:
All explicit memory accesses of the required access types appearing in program order before the DSB are complete for the set of observers in the required shareability domain.
[...]
and
In addition, no instruction that appears in program order after the DSB instruction can alter any state of the system or perform any part of its functionality until the DSB completes other than [...]
Upvotes: 1
Views: 1093
Reputation: 22430
Unix smurf wrote a series on ARM memory barriers. The DSB
is a super-set of the DMB
or in other words the DSB
is more restrictive. The DMB
is sufficient to ensure that the writes to x1
and x2
are complete before y1
or y2
is updated using normal memory. Ie, it is a sufficient substitute for barrier in your example on most ARM systems with DSB
also working.
An OS can use different properties in the MMU tables and this could affect your results. For instance using graphics RAM or some network devices buffer as the backing store for x1
, x2
, may not need to issue a dmb
or the dmb
may need different parameters as these type of memory maybe put in a different domain.
In fact, an OS can probably completely subvert these mechanisms. This won't be a factor for most use cases and I just state it to be complete. It is also possible to have AMP (asymmetric multi-processor) systems where this won't work. If you have a system with a ARMv8 and a Cortex-M for instance.
Reference:
Upvotes: 1