Reputation: 13116
When I use the x86_64 CAS instruction LOCK CMPXCHG, i.e. while the atomic operation runs (it reads the value, compares it, and writes the result back), what exactly is locked during that time?
Is this true, that x86_64 Intel CPU uses?
Upvotes: 3
Views: 620
Reputation: 19706
Neither is accurate. The second is similar to what actually happens on a bus lock, which on modern x86 CPUs is a (hopefully) rare and pathological case, used when a regular lock can't work. It used to be common on the old 486 / early Pentiums, but on newer products the common case is much simpler: you lock the line in the cache. Since you want the read-modify-write to be as fast as possible, there's also no sense in doing this in the L3. Instead, you'd pick the cache closest to the operating core, probably the L1 or some equivalent internal structure.
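As far as I know from the Intel manuals, the usual way to end up on the bus-lock path is a locked operand that straddles two cache lines (a "split lock"), or one that sits in uncacheable memory. Here is a minimal sketch of how you might check for the first case. The 64-byte line size is an assumption (typical for current x86-64 parts, not guaranteed), and `crosses_cache_line`, `PaddedCounter` and `padded_counter_fits_one_line` are names I made up for illustration:

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines (typical for current x86-64, not guaranteed).
constexpr std::size_t kAssumedLineSize = 64;

// True if the byte range [addr, addr + size) straddles a cache-line boundary.
inline bool crosses_cache_line(std::uintptr_t addr, std::size_t size,
                               std::size_t line = kAssumedLineSize) {
    assert(size > 0);
    return (addr / line) != ((addr + size - 1) / line);
}

// Hypothetical counter aligned to its own line, so a locked RMW on it
// cannot split across two lines.
struct alignas(kAssumedLineSize) PaddedCounter {
    std::atomic<std::uint64_t> value{0};
};

// Checks that the atomic member of a PaddedCounter fits in a single line.
inline bool padded_counter_fits_one_line() {
    PaddedCounter c;
    auto addr = reinterpret_cast<std::uintptr_t>(&c.value);
    return !crosses_cache_line(addr, sizeof(c.value));
}
```

A naturally aligned `std::atomic` already stays within one line; the explicit `alignas` mostly matters for avoiding false sharing and for atomics carved out of packed or hand-placed buffers.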
You can guarantee that the atomic RMW is done safely in the cache even with simple MESI: you first get ownership of the line (as any normal write would need to), then you run the atomic flow knowing for sure that no other core has this line. The only problem is that snoops may in theory come in the middle, so the usual solution is simply to block snoops for this line until the RMW is done. There's no problem with allowing any other activity during that period (such as other requests coming out of the same core, or snoops coming in for other lines). The only other limitation concerns memory ordering, but that's usually handled in the memory unit (where there's still a notion of order) and not at the cache.
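From the software side this is all invisible; you just see a CAS that either succeeds or reports the current value. A rough sketch of what that looks like in C++, assuming a GCC/Clang-style compiler targeting x86-64, where `compare_exchange_weak` on a 64-bit `std::atomic` is typically compiled to `lock cmpxchg` (`cas_increment` and `run_cas_counter` are illustrative names, not from any library):

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Increment through an explicit CAS retry loop.
inline void cas_increment(std::atomic<std::uint64_t>& counter) {
    std::uint64_t expected = counter.load(std::memory_order_relaxed);
    // On failure, compare_exchange_weak stores the value it observed into
    // `expected`, so the next attempt retries with fresh data.
    while (!counter.compare_exchange_weak(expected, expected + 1)) {
    }
}

// Runs `threads` threads, each doing `iters` CAS increments, and returns
// the final counter value.
inline std::uint64_t run_cas_counter(unsigned threads, unsigned iters) {
    std::atomic<std::uint64_t> counter{0};
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, iters] {
            for (unsigned i = 0; i < iters; ++i) {
                cas_increment(counter);
            }
        });
    }
    for (auto& th : pool) {
        th.join();
    }
    return counter.load();
}
```

Calling `run_cas_counter(4, 10000)` should give 40000: every successful CAS owned the line exclusively for the duration of its read-modify-write, so no increment is lost, and a core that loses the race simply retries.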
See also the manual section in this answer - x86 LOCK question on multi-core CPUs
Upvotes: 7