Difficulty understanding the assembly code of '__atomic_compare_exchange'

Question

I program in C++ and use CAS (compare-and-swap) operations for thread synchronization.

I profiled my program with VTune and found that a huge portion of the time was spent on the CAS operation.

I took a look at the assembly code.

The profiling result shows that a significant portion of the time is attributed to 'movq %rax, (%rsi)', and not to 'lock cmpxchgq %rcx, (%rdi)'.

How is the 'movq %rax, (%rsi)' operation related to the CAS operation? Which data is being moved by this instruction?

1201ProgramAlarm · Accepted Answer

The lock cmpxchgq is what is taking a long time. When the profiler samples where the program currently is, it sometimes has to wait for the in-flight instruction to finish executing before it can find out. As a result, the instruction that follows a long, non-interruptible instruction is reported as taking a large amount of time, when the cost really belongs to the previous instruction.
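
As for which data the movq moves: the following is a minimal sketch, assuming the code uses std::atomic (or GCC's __atomic_compare_exchange builtin) on x86-64. A compare-exchange takes its expected value by reference and, on failure, overwrites it with the value actually found in memory. A failed cmpxchg leaves that observed value in %rax, so the store to (%rsi) is most likely this write-back into the caller's expected variable. The exact registers depend on the compiler and optimization level. The names try_swap, fetch_add_with_cas and demo are hypothetical and only illustrate the behavior.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper: one CAS attempt. Returns true if `desired` was stored.
// On failure, `expected` is overwritten with the value found in `target`;
// that write-back is the kind of store that follows the lock cmpxchg.
bool try_swap(std::atomic<std::uint64_t>& target,
              std::uint64_t& expected,
              std::uint64_t desired) {
    return target.compare_exchange_strong(expected, desired);
}

// Hypothetical CAS retry loop: atomically adds `delta` and returns the
// value that was replaced. Each failed attempt refreshes `expected`,
// so no separate reload is needed before retrying.
std::uint64_t fetch_add_with_cas(std::atomic<std::uint64_t>& target,
                                 std::uint64_t delta) {
    std::uint64_t expected = target.load(std::memory_order_relaxed);
    while (!target.compare_exchange_weak(expected, expected + delta)) {
        // `expected` now holds the value another thread installed
        // (or the unchanged value after a spurious failure); retry.
    }
    return expected;
}

// Shows the write-back on a deliberately stale `expected`.
void demo() {
    std::atomic<std::uint64_t> value{5};
    std::uint64_t expected = 3;            // stale guess
    bool ok = try_swap(value, expected, 9);
    assert(!ok);                           // CAS fails: 3 != 5
    assert(expected == 5);                 // observed value written back
    assert(value.load() == 5);             // target is unchanged
}
```

If many threads contend on the same variable, most of the time is typically spent in the locked instruction itself while the cache line moves between cores, which is consistent with the answer above: the store after it only appears expensive in the profile.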
