Reputation: 827

binary translation

The VMM traps privileged instructions and handles them using binary translation. But what exactly are these privileged instructions translated into?

Thanks

Upvotes: 16

Answers (3)

makes

Reputation: 6548

See VMware_paravirtualization.pdf, pages 3 and 4.

This approach, depicted in Figure 5, translates kernel code to replace nonvirtualizable instructions with new sequences of instructions that have the intended effect on the virtual hardware.

So the privileged instructions are translated into other instructions that access the virtual BIOS, memory management, and devices provided by the Virtual Machine Monitor, instead of executing directly on the real hardware.

Exactly what these instructions are is defined by the VM implementation. Vendors of proprietary virtualization software don't necessarily publish their binary translation techniques.
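To make the idea concrete, here is a rough Python sketch. It is not any vendor's actual scheme: `VirtualMachine`, `translate`, the toy instruction tuples, and the choice of port 0x3F8 for a virtual serial device are all assumptions made up for illustration. Each privileged instruction is turned into a function that only touches virtual hardware state.

```python
class VirtualMachine:
    """Toy virtual hardware state owned by the VMM (hypothetical)."""
    def __init__(self):
        self.interrupts_enabled = True
        self.halted = False
        self.serial_output = []  # bytes written to a virtual serial port

def translate(instr):
    """Return a function that applies the instruction's effect to the VM only."""
    op, *args = instr
    if op == "cli":
        return lambda vm: setattr(vm, "interrupts_enabled", False)
    if op == "sti":
        return lambda vm: setattr(vm, "interrupts_enabled", True)
    if op == "hlt":
        return lambda vm: setattr(vm, "halted", True)
    if op == "out":
        port, value = args
        # Assumption: port 0x3F8 is wired to the virtual serial device.
        if port == 0x3F8:
            return lambda vm: vm.serial_output.append(value)
        return lambda vm: None  # unknown port: ignored in this sketch
    raise ValueError(f"unhandled instruction: {op}")

vm = VirtualMachine()
for instr in [("cli",), ("out", 0x3F8, ord("A")), ("hlt",)]:
    translate(instr)(vm)
```

After the loop, the real machine's interrupt flag and I/O ports were never touched; only the fields of `vm` changed.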

Upvotes: 18

Jacob Miller

Reputation: 37

Guest kernel instructions are translated into safe-to-execute instructions that keep the guest's virtualization intact. It is a complex process, and the exact algorithms vary per hypervisor. VMware introduced the technique for x86 virtualization around 1998. A research paper by two VMware researchers, covering the algorithms involved and including sample code, can be found on the Wayback Machine. Relevant excerpts from that paper:

Our software VMM uses a translator with these properties:

• Binary. Input is binary x86 code, not source code.

• Dynamic. Translation happens at runtime, interleaved with execution of the generated code.

• On demand. Code is translated only when it is about to execute. This laziness side-steps the problem of telling code and data apart.

• System level. The translator makes no assumptions about the guest code. Rules are set by the x86 ISA, not by a higher-level ABI. In contrast, an application-level translator like Dynamo [4] might assume that “return addresses are always produced by calls” to generate faster code. The VMM does not: it must run a buffer overflow that clobbers a return address precisely as it would have run natively (producing the same hex numbers in the resulting error message).

• Subsetting. The translator’s input is the full x86 instruction set, including all privileged instructions; output is a safe subset (mostly user-mode instructions).

• Adaptive. Translated code is adjusted in response to guest behavior changes to improve overall efficiency.
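The "dynamic" and "on demand" properties above can be sketched in a few lines of Python. This is only a toy model under my own assumptions: the guest "memory" is a dictionary, each translated block holds a single instruction, and names such as `translation_cache`, `translate_block`, and `run` are hypothetical, not taken from the paper.

```python
translation_cache = {}   # guest address -> translated block (the "TC")
translate_calls = []     # records which addresses were actually translated

# Toy guest "memory": address -> (instruction text, next address or None).
guest_code = {
    0x100: ("mov", 0x104),
    0x104: ("cli", 0x108),
    0x108: ("ret", None),
    0x200: ("never executed", None),  # could just as well be data
}

PRIVILEGED = {"cli", "sti", "hlt"}

def translate_block(addr):
    """Translate the single instruction at addr (a real TU holds several)."""
    translate_calls.append(addr)
    text, next_addr = guest_code[addr]
    safe = f"emulate_{text}" if text in PRIVILEGED else text
    return safe, next_addr

def run(entry):
    """Execute from entry, translating lazily and caching the result."""
    executed = []
    addr = entry
    while addr is not None:
        if addr not in translation_cache:        # translate on demand
            translation_cache[addr] = translate_block(addr)
        safe, addr = translation_cache[addr]
        executed.append(safe)
    return executed

first = run(0x100)
second = run(0x100)   # served entirely from the cache
```

Address 0x200 is never reached, so it is never translated. That is the sense in which laziness side-steps the code-versus-data problem: the translator only looks at bytes the guest is about to execute.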

The first TU in our example is:

    isPrime: mov %ecx, %edi
             mov %esi, $2
             cmp %esi, %ecx
             jge prime

Translating from x86 to x86 subset, most code can be translated IDENT (for “identically”). The first three instructions above are IDENT. jge must be non-IDENT since translation does not preserve code layout. Instead, we turn it into two translator-invoking continuations, one for each of the successors (fall-through and taken-branch), yielding this translation (square brackets indicate continuations):

    isPrime’: mov %ecx, %edi ; IDENT
              mov %esi, $2
              cmp %esi, %ecx
              jge [takenAddr] ; JCC
              jmp [fallthrAddr]
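The same transformation can be mimicked in Python. This is a sketch under assumptions of my own: instructions are plain tuples, the guest addresses 0x4010 and 0x400C are invented, and `translate_tu` is a hypothetical helper rather than anything from the paper.

```python
def translate_tu(instrs, fallthrough_addr):
    """Translate a TU that ends in a conditional branch.

    instrs: list of (mnemonic, operands) tuples; the last one is the branch,
            whose operand is the guest address of the taken target.
    fallthrough_addr: guest address of the instruction after the branch.
    """
    out = []
    *body, (branch_op, taken_addr) = instrs
    for op, operands in body:
        out.append((op, operands))                 # IDENT: copied unchanged
    # Non-IDENT: each successor becomes a continuation into the translator.
    out.append((branch_op, ("continuation", taken_addr)))
    out.append(("jmp", ("continuation", fallthrough_addr)))
    return out

is_prime_tu = [
    ("mov", "%ecx, %edi"),
    ("mov", "%esi, $2"),
    ("cmp", "%esi, %ecx"),
    ("jge", 0x4010),   # hypothetical guest address of `prime`
]
translated = translate_tu(is_prime_tu, fallthrough_addr=0x400C)
```

The first three entries of `translated` are identical to the input; the single `jge` becomes a `jge` to the taken-branch continuation followed by a `jmp` to the fall-through continuation.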

While most instructions can be translated IDENT, there are several noteworthy exceptions:

• PC-relative addressing cannot be translated IDENT since the translator output resides at a different address than the input. The translator inserts compensation code to ensure correct addressing. The net effect is a small code expansion and slowdown.

• Direct control flow. Since code layout changes during translation, control flow must be reconnected in the TC. For direct calls, branches and jumps, the translator can do the mapping from guest address to TC address. The net slowdown is insignificant.

• Indirect control flow (jmp, call, ret) does not go to a fixed target, preventing translation-time binding. Instead, the translated target must be computed dynamically, e.g., with a hash table lookup. The resulting overhead varies by workload but is typically a single-digit percentage.

• Privileged instructions. We use in-TC sequences for simple operations. These may run faster than native: e.g., cli (clear interrupts) on a Pentium 4 takes 60 cycles whereas the translation runs in a handful of cycles (“vcpu.flags.IF:=0”). Complex operations like context switches call out to the runtime, causing measurable overhead due both to the callout and the emulation work.
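Two of the exceptions above, indirect control flow and simple privileged instructions, are easy to illustrate. The following Python sketch assumes a dictionary as the hash table and invents all addresses and names (`guest_to_tc`, `translate_on_miss`, `indirect_jump`, `translated_cli`); it shows the shape of the idea, not VMware's implementation.

```python
class VCPU:
    """Virtual CPU state kept by the VMM (hypothetical layout)."""
    def __init__(self):
        self.flags_if = True   # virtual interrupt-enable flag

# Hash table from guest addresses to translation-cache (TC) addresses.
guest_to_tc = {0x1000: 0x9000, 0x2000: 0x9040}

def translate_on_miss(guest_addr):
    """Stand-in for invoking the translator; assigns a made-up TC address."""
    tc_addr = 0x9000 + 0x40 * len(guest_to_tc)
    guest_to_tc[guest_addr] = tc_addr
    return tc_addr

def indirect_jump(guest_target):
    """Resolve a runtime-computed guest target to its TC address."""
    tc_addr = guest_to_tc.get(guest_target)
    if tc_addr is None:
        tc_addr = translate_on_miss(guest_target)
    return tc_addr

def translated_cli(vcpu):
    """In-TC sequence for cli: a single store to virtual CPU state."""
    vcpu.flags_if = False

vcpu = VCPU()
translated_cli(vcpu)
hit = indirect_jump(0x2000)    # already translated
miss = indirect_jump(0x3000)   # triggers translation, then is cached
```

The lookup on every indirect jump is where the per-workload overhead comes from, while the `cli` replacement is just one memory write, which is why it can beat the native instruction.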

Finally, although the details are beyond the scope of this paper, we observe that BT is not required for safe execution of most user code on most guest operating systems. By switching guest execution between BT mode and direct execution as the guest switches between kernel- and user-mode, we can limit BT overheads to kernel code and permit application code to run at native speed.
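A very simplified way to picture that mode switch, assuming the decision depends only on the guest's current privilege level (the paper says the details are more involved, so treat `pick_execution_mode` and the sample trace as hypothetical):

```python
def pick_execution_mode(guest_cpl):
    """Ring 0 (guest kernel) goes through BT; other rings run directly."""
    return "binary_translation" if guest_cpl == 0 else "direct_execution"

# A guest running user code, entering the kernel, then returning to user code.
trace = [3, 3, 0, 0, 3]
modes = [pick_execution_mode(cpl) for cpl in trace]
bt_share = modes.count("binary_translation") / len(modes)
```

Only the kernel-mode portion of the trace pays the translation cost.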

Upvotes: 1

Raj

Reputation: 887

Binary translation is a system virtualization technique.

Sensitive instructions in the guest OS binary are replaced either by hypervisor calls that handle them safely or by undefined opcodes that cause a CPU trap, which the hypervisor then handles.

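On x86 CPUs without hardware virtualization support, some sensitive instructions are non-virtualizable: they do not trap when run outside ring 0, so trap-and-emulate alone cannot intercept them. Binary translation is a technique to overcome this limitation.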

For example, if the guest wants to read or modify the CPU's processor status word, which contains important flags and control bit fields, the hypervisor scans the guest binary for such instructions and replaces them with either a call into the hypervisor or a dummy opcode that traps.
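Here is a toy version of that scan-and-replace step in Python. It works on a list of mnemonic strings rather than real machine code, since decoding variable-length x86 instructions is a topic of its own. The `hv_emulate_*` call targets and the `patch_guest_code` helper are made-up names; `pushf` and `popf` are used because they are the textbook x86 instructions that touch the flags register without trapping in user mode.

```python
# Instructions that read or write the flags register without trapping
# in user mode on classic x86.
SENSITIVE = {"pushf", "popf"}

def patch_guest_code(instrs):
    """Replace sensitive mnemonics with calls into a hypothetical hypervisor."""
    patched = []
    for instr in instrs:
        mnemonic = instr.split()[0]
        if mnemonic in SENSITIVE:
            patched.append(f"call hv_emulate_{mnemonic}")
        else:
            patched.append(instr)   # everything else is left unchanged
    return patched

guest = ["mov eax, 1", "pushf", "add eax, ebx", "popf", "ret"]
patched = patch_guest_code(guest)
```

Everything except the two flag instructions passes through untouched, which matches the observation that only a small fraction of guest code needs rewriting.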

Paravirtualization, on the other hand, is a technique in which the source code of the guest OS is modified: all code that accesses system resources is rewritten to use hypervisor APIs.

Upvotes: 27
