doron

Reputation: 28892

Compare and Exchange on Android (ARM)

The code below is the ARM implementation of compare-and-exchange on Android:

__ATOMIC_INLINE__ int __bionic_cmpxchg(int32_t old_value, int32_t new_value, volatile int32_t* ptr) {
  int32_t prev, status;
  do {
    __asm__ __volatile__ (
          "ldrex %0, [%3]\n"
          "mov %1, #0\n"
          "teq %0, %4\n"
#ifdef __thumb2__
          "it eq\n"
#endif
          "strexeq %1, %5, [%3]"
          : "=&r" (prev), "=&r" (status), "+m"(*ptr)
          : "r" (ptr), "Ir" (old_value), "r" (new_value)
          : "cc");
  } while (__builtin_expect(status != 0, 0));
  return prev != old_value;
}

Does the strexeq clear the monitor set by the ldrex even if the condition is not equal? If not, how is this safe?

Also, why do we need the extra it eq for Thumb-2?

Upvotes: 1

Views: 1119

Answers (1)

Notlikethat

Reputation: 20934

Does the strexeq clear the monitor set in the ldrex even if the condition is not equal?

No. Nor does it need to - this is the "cmp" part of the cmpxchg. If the value loaded isn't the expected one, the teq gives the ne condition and the strexeq is skipped. Status is still 0 from the mov %1, #0, so we fall out of the loop, return non-zero (prev != old_value), and everyone forgets about the whole thing.

If the value loaded was the right one, then we try the conditional strex to exchange it.
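
The two outcomes above can also be sketched in portable C11, which may make the control flow easier to follow than the inline asm. This is only an illustrative counterpart, not Bionic's code: the name cmpxchg_sketch is made up, and relaxed ordering is assumed because the asm in the question contains no barrier. A weak compare-exchange is allowed to fail spuriously, much like a strex, so the loop retries only when the value still matched.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical portable counterpart of the function in the question.
   Returns 0 if the swap happened, non-zero if *ptr did not hold old_value. */
static int cmpxchg_sketch(int32_t old_value, int32_t new_value,
                          _Atomic int32_t *ptr) {
    int32_t prev;
    do {
        prev = old_value;
        /* On failure, prev is overwritten with the value actually seen. */
        if (atomic_compare_exchange_weak_explicit(ptr, &prev, new_value,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed))
            return 0;              /* compared equal and the store went through */
    } while (prev == old_value);   /* value matched but the store failed: retry */
    return 1;                      /* the "ne" case: nothing was written */
}
```

As in the original, a mismatch returns immediately without storing anything, and only a failed store sends us round the loop again.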

All ldrex does is set a flag (the exclusive monitor) to say "nobody has touched this area of memory since my ldrex". If anybody then writes to that area, the flag is cleared. A strex will succeed if and only if it finds the flag is still set. If it finds the flag cleared, that means the loaded value may have changed in memory, which violates the atomicity of the operation, so the store fails and no update occurs. In this case, we have to go right back to the beginning and try again from scratch - eventually, we'll get through the whole sequence without interruption, at which point it will appear to have been an atomic update.

There is no need to worry about the exclusive monitor state in either case; any later exclusive code will by definition start with an ldrex, and that will initialise the monitors appropriately at that point.
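
To make the monitor behaviour concrete, here is a toy single-threaded model in plain C. Everything in it (model_word, model_ldrex, model_strex, model_other_store, model_cmpxchg, the interruptions counter) is invented for illustration; real hardware tracks the monitor per CPU and can clear it for other reasons too. It is a sketch of the idea, not a description of any particular core.

```c
#include <stdbool.h>
#include <stdint.h>

/* One word of memory plus a toy exclusive monitor (hypothetical model). */
typedef struct {
    int32_t value;
    bool exclusive;   /* "nobody has touched this since my ldrex" */
} model_word;

/* ldrex: load the value and set the flag. */
static int32_t model_ldrex(model_word *w) {
    w->exclusive = true;
    return w->value;
}

/* A store by somebody else: updates memory and clears the flag. */
static void model_other_store(model_word *w, int32_t v) {
    w->value = v;
    w->exclusive = false;
}

/* strex: store only if the flag is still set. Returns 0 on success, 1 on failure. */
static int model_strex(model_word *w, int32_t v) {
    if (!w->exclusive)
        return 1;
    w->value = v;
    w->exclusive = false;
    return 0;
}

/* Same shape as the loop in the question. `interruptions` is how many times
   another writer sneaks in between the ldrex and the strex; `attempts`
   reports how many passes through the loop were needed. */
static int model_cmpxchg(int32_t old_value, int32_t new_value, model_word *w,
                         int interruptions, int *attempts) {
    int32_t prev;
    int status;
    *attempts = 0;
    do {
        ++*attempts;
        prev = model_ldrex(w);               /* ldrex %0, [%3] */
        status = 0;                          /* mov %1, #0 */
        if (interruptions > 0) {
            interruptions--;
            model_other_store(w, w->value);  /* same value, but flag is cleared */
        }
        if (prev == old_value)               /* teq %0, %4 */
            status = model_strex(w, new_value);  /* strexeq %1, %5, [%3] */
    } while (status != 0);
    return prev != old_value;
}
```

With two interruptions the loop takes three attempts before the store lands. On a mismatch it exits after one attempt with the flag still set, which is harmless because the next exclusive sequence begins with its own ldrex.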

Also, why do we need the extra it eq for Thumb-2?

Because Thumb doesn't have conditional execution (except for branches), so there are no bits in an instruction encoding to hold a condition code. Thumb-2 introduced the it instruction as a way to make a block of up to 4 subsequent instructions predicated on a particular condition (or its opposite) via the global ITSTATE. Whilst some assemblers are clever enough to automatically generate appropriate it blocks when assembling ARM code for Thumb-2, it's not something you can necessarily rely on in portable code.

A well-behaved assembler should ignore an it instruction when assembling for ARM (but still raise an error if it doesn't match the conditions on the following instructions), but it's probably preprocessed out here for the benefit of stupid compilers which guess the length of an inline asm block by counting newlines.

Upvotes: 3
