noman pouigt

Reputation: 976

Is Linux's "mutex lock" implemented using a "memory barrier"?

I was reading this, where Robert Love mentions that a mutex is implemented using a memory barrier, but I cannot find any memory barrier instructions in the Linux implementation of mutex lock.

I was wondering whether he was referring to the mutex lock implementation in the POSIX library, which does use a memory barrier instruction so that the lock operation is not reordered with respect to accesses to the critical resource. Am I right?

Upvotes: 4

Views: 1457

Answers (3)

user339589

Reputation: 116

There are two problems with this answer: (1) it is specific to x86 and (2) it is obsolete (which will be the common case in the Linux kernel, especially after eight years).

The current code is as follows:

    static __always_inline bool __mutex_trylock_fast(struct mutex *lock)
    {
      unsigned long curr = (unsigned long)current;
      unsigned long zero = 0UL;

      if (atomic_long_try_cmpxchg_acquire(&lock->owner, &zero, curr))
        return true;

      return false;
    }

This is guaranteed to provide only acquire semantics, not a full barrier. Yes, this does provide a full barrier on x86, but that is an accident of implementation. That full barrier is absolutely not guaranteed across all architectures.
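To make the acquire-only point concrete, here is a minimal user-space sketch in C11 atomics. It is not the kernel code: `struct toy_mutex`, `toy_mutex_trylock`, `toy_mutex_unlock` and `toy_mutex_demo` are hypothetical names, and pairing the acquire compare-exchange with a plain release store on unlock is an assumption made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the kernel's struct mutex. */
struct toy_mutex {
    atomic_uintptr_t owner;   /* 0 means unlocked, otherwise an owner id */
};

/* Try to take the lock: acquire ordering on success, relaxed on failure.
 * This requests acquire semantics only, not a full barrier. */
static bool toy_mutex_trylock(struct toy_mutex *lock, uintptr_t self)
{
    uintptr_t zero = 0;
    return atomic_compare_exchange_strong_explicit(
        &lock->owner, &zero, self,
        memory_order_acquire, memory_order_relaxed);
}

/* Release the lock: the release store pairs with the acquire above. */
static void toy_mutex_unlock(struct toy_mutex *lock)
{
    atomic_store_explicit(&lock->owner, 0, memory_order_release);
}

/* Single-threaded walk-through of the state transitions. */
static int toy_mutex_demo(void)
{
    struct toy_mutex m;
    int protected_data = 0;

    atomic_init(&m.owner, 0);
    assert(toy_mutex_trylock(&m, 1));    /* unlocked -> owned by 1 */
    assert(!toy_mutex_trylock(&m, 2));   /* already owned, so this fails */
    protected_data = 42;                 /* "critical section" */
    toy_mutex_unlock(&m);
    assert(toy_mutex_trylock(&m, 2));    /* free again after unlock */
    toy_mutex_unlock(&m);
    return protected_data;
}
```

On x86 a compiler will typically emit a LOCK-prefixed `cmpxchg` for the compare-exchange, which happens to be a full barrier; on weakly ordered architectures the same source only has to provide acquire ordering.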

Upvotes: 1

Indeed, mutexes need some memory synchronization. The important thing is how to wait for a mutex to be unlocked (by some other thread) without busy spinlocks (in particular, because you don't want a waiting thread to eat a lot of CPU). Read about futex(7). Like clone(2), the futex(2) system call is only useful for implementors of threading libraries.

BTW, both GNU libc & musl-libc are free software implementations of POSIX threads. So study their source code if you want to understand the details.
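As a rough illustration of waiting without spinning, here is a minimal futex-based lock in C, following the well-known three-state scheme (0 = unlocked, 1 = locked, 2 = locked with possible waiters). It is a sketch, not what glibc or musl actually do. The names `futex_call`, `futex_mutex_lock`, `futex_mutex_unlock` and `futex_mutex_demo` are hypothetical, and it assumes Linux and that `_Atomic uint32_t` has the same representation as `uint32_t`.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* glibc has no futex() wrapper, so go through syscall(2). */
static long futex_call(_Atomic uint32_t *word, int op, uint32_t val)
{
    return syscall(SYS_futex, (uint32_t *)word, op, val, NULL, NULL, 0);
}

static void futex_mutex_lock(_Atomic uint32_t *state)
{
    uint32_t c = 0;

    /* Fast path: 0 -> 1 with no system call at all. */
    if (atomic_compare_exchange_strong_explicit(
            state, &c, 1, memory_order_acquire, memory_order_relaxed))
        return;

    /* Slow path: mark the lock as contended, then sleep in the kernel
     * until the holder wakes us. FUTEX_WAIT returns immediately if the
     * word is no longer 2, so a wakeup cannot be lost. */
    if (c != 2)
        c = atomic_exchange_explicit(state, 2, memory_order_acquire);
    while (c != 0) {
        futex_call(state, FUTEX_WAIT, 2);
        c = atomic_exchange_explicit(state, 2, memory_order_acquire);
    }
}

static void futex_mutex_unlock(_Atomic uint32_t *state)
{
    /* If the old value was 2 there may be sleepers: reset and wake one. */
    if (atomic_fetch_sub_explicit(state, 1, memory_order_release) != 1) {
        atomic_store_explicit(state, 0, memory_order_release);
        futex_call(state, FUTEX_WAKE, 1);
    }
}

/* Uncontended walk-through; returns the final state of the lock word. */
static int futex_mutex_demo(void)
{
    _Atomic uint32_t state;

    atomic_init(&state, 0);
    futex_mutex_lock(&state);
    assert(atomic_load(&state) == 1);   /* locked, no waiters */
    futex_mutex_unlock(&state);
    return (int)atomic_load(&state);
}
```

In the uncontended case neither function enters the kernel, and the memory ordering comes entirely from the atomic operations on the lock word.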

Upvotes: 0

Tsyvarev

Reputation: 65928

Robert Love's answer applies to mutexes in any context.

The Linux kernel implementation you refer to uses __mutex_fastpath_lock, which does most of the work and is usually implemented in assembly. E.g., on x86_64 its implementation could be:

    static inline void __mutex_fastpath_lock(atomic_t *v,
                                             void (*fail_fn)(atomic_t *))
    {
            asm_volatile_goto(LOCK_PREFIX "   decl %0\n"
                              "   jns %l[exit]\n"
                              : : "m" (v->counter)
                              : "memory", "cc"
                              : exit);
            fail_fn(v);
    exit:
            return;
    }

The key here is the LOCK prefix (LOCK_PREFIX) before the dec (decl) operation. On x86 the LOCK prefix makes the instruction atomic and always implies a full memory barrier.
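For readers less used to inline assembly, the same decrement-and-test logic can be sketched in portable C11. This is an illustration under assumptions, not the kernel's code: `toy_fastpath_lock`, `toy_count_slowpath` and `toy_fastpath_demo` are hypothetical names, and the counter convention (1 = unlocked, 0 = locked, negative = contended) is inferred from the `decl`/`jns` pair above.

```c
#include <assert.h>
#include <stdatomic.h>

typedef void (*fail_fn_t)(atomic_int *);

/* Atomically decrement the counter; if the result went negative the lock
 * was already held, so fall back to the slow path (the jns branch above).
 * A seq_cst read-modify-write is typically compiled to a LOCK-prefixed
 * instruction on x86. */
static void toy_fastpath_lock(atomic_int *count, fail_fn_t fail_fn)
{
    int new_value =
        atomic_fetch_sub_explicit(count, 1, memory_order_seq_cst) - 1;

    if (new_value < 0)
        fail_fn(count);
}

/* Stand-in slow path that only records that it was called. */
static int toy_slowpath_calls;

static void toy_count_slowpath(atomic_int *count)
{
    (void)count;
    toy_slowpath_calls++;
}

/* Returns how many of the two lock attempts needed the slow path. */
static int toy_fastpath_demo(void)
{
    atomic_int count;

    atomic_init(&count, 1);                         /* unlocked */
    toy_slowpath_calls = 0;
    toy_fastpath_lock(&count, toy_count_slowpath);  /* 1 -> 0: fast path */
    assert(toy_slowpath_calls == 0);
    toy_fastpath_lock(&count, toy_count_slowpath);  /* 0 -> -1: slow path */
    return toy_slowpath_calls;
}
```

The first call takes the lock without leaving the fast path; the second finds it held and calls the fallback, which in a real mutex would queue the caller and sleep.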

Upvotes: 11
