To implement efficient spinlocks in a VM environment, the KVM documentation says that a vCPU waiting on a spinlock can execute the HLT instruction and let the vCPU holding the spinlock get a chance to run; the holder can then issue the KVM_HC_KICK_CPU hypercall to wake the waiting vCPU.
Now here is my question:
Imagine the sequence of instructions below:
CHECK_SPIN_LOCK_FLAG
// <------------ waiting vCPU gets scheduled out right before executing hlt
hlt
Now, when the spinlock-holder vCPU runs, releases the spinlock, and then tries to wake the waiting vCPU, there is nothing for the kick to do, since that vCPU is already running rather than halted. However, when the waiting vCPU gets scheduled again, it will execute the HLT instruction and remain halted there.
Is this a race condition in the design of this hypercall?
The following is an excerpt from Documentation/virt/kvm/x86/hypercalls.rst:
5. KVM_HC_KICK_CPU
------------------
:Architecture: x86
:Status: active
:Purpose: Hypercall used to wakeup a vcpu from HLT state
:Usage example:
A vcpu of a paravirtualized guest that is busywaiting in guest
kernel mode for an event to occur (ex: a spinlock to become available) can
execute HLT instruction once it has busy-waited for more than a threshold
time-interval. Execution of HLT instruction would cause the hypervisor to put
the vcpu to sleep until occurrence of an appropriate event. Another vcpu of the
same guest can wakeup the sleeping vcpu by issuing KVM_HC_KICK_CPU hypercall,
specifying APIC ID (a1) of the vcpu to be woken up. An additional argument (a0)
is used in the hypercall for future use.
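To make the calling convention concrete: the same hypercalls.rst describes the guest placing the hypercall number in rax and arguments a0-a3 in rbx, rcx, rdx and rsi, and entering the hypervisor with vmcall (vmmcall on AMD). Below is a minimal sketch of issuing the kick from guest C code under those assumptions; the helper name kvm_hypercall2_raw is made up here for illustration, and the literal 5 for KVM_HC_KICK_CPU should be checked against include/uapi/linux/kvm_para.h.

/* Issue a two-argument KVM hypercall: nr in rax, a0 in rbx, a1 in rcx.
 * vmcall is the Intel encoding; an AMD guest would use vmmcall.
 * Must run in guest ring 0. */
static inline long kvm_hypercall2_raw(unsigned long nr,
                                      unsigned long a0, unsigned long a1)
{
        long ret;
        asm volatile("vmcall"
                     : "=a"(ret)
                     : "a"(nr), "b"(a0), "c"(a1)
                     : "memory");
        return ret;
}

#define KVM_HC_KICK_CPU 5   /* check include/uapi/linux/kvm_para.h */

/* Wake the vCPU with APIC ID 'apicid' out of HLT; a0 is currently unused. */
static inline void kick_the_vcpu(unsigned long apicid)
{
        kvm_hypercall2_raw(KVM_HC_KICK_CPU, 0, apicid);
}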
Okay, I am not sure whether I have actually solved the problem, but it seems there is one more hypercall:
7. KVM_HC_SCHED_YIELD
---------------------
:Architecture: x86
:Status: active
:Purpose: Hypercall used to yield if the IPI target vCPU is preempted
a0: destination APIC ID
:Usage example: When sending a call-function IPI-many to vCPUs, yield if
any of the IPI target vCPUs was preempted.
We can use the above hypercall before waking up the CPU, to make sure the target vCPU was indeed in the halted state. If it was preempted instead, we yield to it and only kick once the target is actually halted rather than preempted.
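Here is a sketch of the helper the assembly below relies on, yield_to_vcpu(), built on the same raw hypercall helper and kick_the_vcpu() sketched after the first excerpt above. The literal 11 for KVM_HC_SCHED_YIELD again mirrors include/uapi/linux/kvm_para.h and should be verified there; note also that both hypercalls identify the target by APIC ID, so the sketch assumes the CPU index stored in the mutex maps directly to the target's APIC ID.

#define KVM_HC_SCHED_YIELD 11   /* check include/uapi/linux/kvm_para.h */

/* If the vCPU with this APIC ID is currently preempted by the host,
 * ask the host for a directed yield to it; otherwise it returns quickly. */
static inline void yield_to_vcpu(unsigned long apicid)
{
        kvm_hypercall2_raw(KVM_HC_SCHED_YIELD, apicid, 0);
}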
Solution as of now:
struct mutex_t {
    uint64_t intent_cpus;  // offset 0:  bitmask of CPUs that intend to take the lock
    uint64_t waiting_cpus; // offset 8:  bitmask of CPUs halted, waiting for a kick
    uint64_t m;            // offset 16: 0 = free, 1 = held
};
get_lock () {
    // rcx has the cpu id, rdx has the address of the mutex
    // set intent for taking the lock
    lock bts %rcx, (%rdx)
retry:
    xorq %rax, %rax              // expected value: 0 (lock free)
    movq $1, %rbx                // new value: 1 (lock held)
    // try to grab m; cmpxchg sets ZF on success
    lock cmpxchgq %rbx, 16(%rdx)
    jz skip_hlt                  // got the lock
    // will be waiting for the kick
    lock bts %rcx, 8(%rdx)
    hlt
    // woken up by the kick: clear the waiting bit and retry
    lock btr %rcx, 8(%rdx)
    jmp retry
skip_hlt:
    // reset intent for the lock (btr updates CF)
    lock btr %rcx, (%rdx)
}
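For readability, here is a C rendering of the same acquire path (same ordering, including the same window between setting the waiting bit and hlt), assuming a kernel-mode guest context where hlt is permitted and GCC __atomic builtins are available; get_lock_c is just an illustrative name.

/* uses struct mutex_t declared above */
static void get_lock_c(struct mutex_t *mtx, uint64_t cpu)
{
        uint64_t me = 1ull << cpu;

        /* advertise intent so the release path can find us */
        __atomic_fetch_or(&mtx->intent_cpus, me, __ATOMIC_SEQ_CST);

        for (;;) {
                uint64_t expected = 0;

                /* try to move m from 0 (free) to 1 (held) */
                if (__atomic_compare_exchange_n(&mtx->m, &expected, 1, 0,
                                                __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
                        break;

                /* lost the race: declare ourselves waiting, then halt */
                __atomic_fetch_or(&mtx->waiting_cpus, me, __ATOMIC_SEQ_CST);
                asm volatile("hlt");
                __atomic_fetch_and(&mtx->waiting_cpus, ~me, __ATOMIC_SEQ_CST);
        }

        /* lock acquired: drop the intent bit */
        __atomic_fetch_and(&mtx->intent_cpus, ~me, __ATOMIC_SEQ_CST);
}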
release_lock() { // rdx has the mutex address
    // release the lock first so a woken CPU can take it
    xorq %rax, %rax
    lock xchgq %rax, 16(%rdx)
    // find the lowest-numbered CPU that has declared intent
    bsfq (%rdx), %rcx                // rcx = least significant set bit, ZF=1 if none
    jz no_cpus_with_intent
check_for_intent:
    bt %rcx, (%rdx)                  // still has intent?
    jnc cpu_does_not_have_intent
    bt %rcx, 8(%rdx)                 // already waiting for a kick?
    jc yield_once_and_return
    /* it has intent but no waiting bit yet:
       yield so that it can either halt or take the lock */
    yield_to_vcpu(%rcx)
    jmp check_for_intent
yield_once_and_return:
    /* here it is either halted or was preempted
       just before executing hlt */
    yield_to_vcpu(%rcx)
    kick_the_vcpu(%rcx)
cpu_does_not_have_intent:
no_cpus_with_intent:
}
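And the matching release path in C, continuing the sketch above; it keeps the same ordering as the assembly: free m first, find the lowest CPU with intent, yield to it while it is between setting intent and setting the waiting bit, and only yield-then-kick once it reports itself as waiting.

static void release_lock_c(struct mutex_t *mtx)
{
        /* free the lock first so whoever we wake can actually take it */
        __atomic_store_n(&mtx->m, 0, __ATOMIC_RELEASE);

        uint64_t intent = __atomic_load_n(&mtx->intent_cpus, __ATOMIC_SEQ_CST);
        if (!intent)
                return;                          /* nobody wants the lock */

        uint64_t cpu = __builtin_ctzll(intent);  /* lowest CPU with intent */
        uint64_t bit = 1ull << cpu;

        /* wait until that CPU either takes the lock (intent bit clears)
         * or declares itself waiting for a kick */
        while (__atomic_load_n(&mtx->intent_cpus, __ATOMIC_SEQ_CST) & bit) {
                if (__atomic_load_n(&mtx->waiting_cpus, __ATOMIC_SEQ_CST) & bit) {
                        /* it may have been preempted just before hlt:
                         * yield to it once, then kick it */
                        yield_to_vcpu(cpu);
                        kick_the_vcpu(cpu);
                        return;
                }
                /* intent but not waiting yet: give it a chance to run */
                yield_to_vcpu(cpu);
        }
}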