Alexander Zhak

Reputation: 9282

What are 8086 ESC instruction opcodes

Mostly of historical interest: if I were to implement 8086 compatibility in an assembler, which operands are considered valid for the ESC instruction?

ESC opcode, source

From the 8086 Programmer's Manual I know that opcode is an immediate in the range 0 to 63 and source is a register or memory. But which registers can be encoded? Both reg8 and reg16, or only reg16? If source is memory, does the operand size (mem8 or mem16) matter?

Basically, neither of the above really matters from an instruction encoding perspective (for example, both esc 0x01, ch and esc 0x01, bp would produce the same result), but maybe assemblers enforced restrictions.

And last but not least, where can I find a description of the ESC opcodes?

Upvotes: 6

Views: 4275

Answers (2)

NinjaDarth

Reputation: 393

Opcodes for the 80x86 - like those of its predecessors, the 8080 and 8085 - are best understood in octal, not hexadecimal. Each 8-bit byte splits into 2+3+3 bits, so the first octal digit ranges over 0-3, while the other two each range over 0-7.

The floating point escape has the octal form 33c xrm D, for Escape #c, with nominal register #r and operand mode (x,m), where D is one byte of the type int8_t for x = 1; 2 bytes of the type uint16_t for x = 2 or for (x,m) = (0,6); and 0 bytes otherwise.

For x = 3, m is nominally a second register: register #m, while for x = 0, 1 or 2 (except for (x,m) = (0,6)), m nominally denotes a combination of indexing registers (BX+SI, BX+DI, BP+SI, BP+DI, SI, DI, BP, BX) for m = (0,1,2,3,4,5,6,7) respectively. The indexing is inapplicable for x = 3 (since m denotes a register, not a memory address), and the index is 0 for (x,m) = (0,6), where the address is just the 2-byte D. The displacement is Disp = 0, if D is 0 bytes; Disp = (int16_t)D (sign-extended), if D is 1 byte; Disp = D, if D is 2 bytes.
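To make the 33c xrm D layout concrete, here is a minimal Python sketch that splits an ESC instruction into its octal fields and works out the displacement. The function name `decode_esc` and the returned dictionary keys are my own for illustration; they don't come from any assembler or manual.

```python
# Nominal index-register combinations selected by m when x is 0, 1 or 2.
INDEX = ("BX+SI", "BX+DI", "BP+SI", "BP+DI", "SI", "DI", "BP", "BX")

def decode_esc(code: bytes) -> dict:
    """Split an ESC instruction (octal form 33c xrm D) into its fields.

    Hypothetical helper for illustration only.
    """
    if len(code) < 2 or code[0] >> 3 != 0o33:
        raise ValueError("not an ESC instruction")
    c = code[0] & 7                      # escape number (last octal digit)
    x = code[1] >> 6                     # operand mode (first octal digit)
    r = (code[1] >> 3) & 7               # nominal register #r
    m = code[1] & 7                      # register #m or index combination

    # Size of D: 1 byte for x = 1; 2 bytes for x = 2 or (x,m) = (0,6).
    if x == 1:
        disp_len = 1
    elif x == 2 or (x, m) == (0, 6):
        disp_len = 2
    else:
        disp_len = 0

    raw = code[2:2 + disp_len]
    if len(raw) != disp_len:
        raise ValueError("truncated displacement")
    # A 1-byte D is an int8_t (sign-extended); a 2-byte D is taken as-is.
    disp = int.from_bytes(raw, "little", signed=(disp_len == 1))

    if x == 3:
        index = None                     # m names a register, no memory access
    elif (x, m) == (0, 6):
        index = None                     # direct address: index is 0
    else:
        index = INDEX[m]

    return {"c": c, "x": x, "r": r, "m": m,
            "index": index, "disp": disp, "length": 2 + disp_len}
```

For example, `decode_esc(bytes([0xD9, 0x46, 0xFE]))` reports escape #1, mode x = 1, index BP and a displacement of -2.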

Opcodes that are not supported trigger an exception in the CPU, as interrupt 6 - starting with the 80186.

I'm not sure exactly what handshaking goes on with the 33c opcodes nor what happens with interrupt 6, if anything. There's a hardware issue there, with synchronizing access to the data bus. For this purpose, the WAIT opcode (233) is present in the 8086 to allow the CPU to defer. An 8087 hooked up to the 8086 would do its thing before the 8086 resumed.

But for the 017 opcode and the other opcode holes, exception interrupt 6 gets triggered (again, from the 80186 onward; the original 8086 has no invalid-opcode exception).

Whoever programs the bare-to-the-metal program for the CPU (by definition: The Firmware) is responsible for writing an exception-handler for interrupt 6 (and for all other interrupts and exceptions). The return address points to the start of the invalid operation, and this is done to give a hook for firmware to implement its own extensions of the opcode set.

There's nothing in the 80x86 language that gives a firmware programmer direct access to the "effective address" interpretation of (x,m,D) or "the register" for r, so they'd have to be explicitly interpreted in the firmware's exception-handler. But I think (x,m,D) drives the data bus at the hardware level, so that part of the task of interpretation is passed on to the hardware engineer. The task of interpreting "r", however, I think still has to be handled in firmware.

By the way, other places where the understanding implied by the octal format gets lost in the translation when people use hexadecimal for the opcodes, include the cases where the octal digit in the middle of an opcode encodes an operation.

What gets lost is that it's the same encoding that appears in various places. For instance, the opcodes 0pq denote operations (add, or, adc, sbb, and, sub, xor, cmp) for p = (0, 1, 2, 3, 4, 5, 6, 7) respectively, with different addressing modes q = (0, 1, 2, 3, 4, 5): cases q = (0, 1, 2, 3) take on extra xrm D bytes, q = 4 takes on an extra 1-byte uint8_t value, and q = 5 an extra 2-byte uint16_t value.
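As a rough illustration of how directly the octal digits map to meaning, here is a small Python sketch that decodes the 0pq opcodes. The name `decode_alu` and the "modrm"/"imm8"/"imm16" labels are mine, purely for illustration.

```python
ALU_OPS = ("add", "or", "adc", "sbb", "and", "sub", "xor", "cmp")

def decode_alu(opcode: int):
    """Decode a 0pq opcode into (operation, kind of trailing bytes).

    Hypothetical helper; returns None for anything outside 0pq, q = 0..5.
    """
    top = opcode >> 6          # first octal digit
    p = (opcode >> 3) & 7      # middle octal digit: the operation
    q = opcode & 7             # last octal digit: the addressing mode
    if top != 0 or q > 5:
        return None
    if q <= 3:
        extra = "modrm"        # followed by xrm D bytes
    elif q == 4:
        extra = "imm8"         # followed by a 1-byte value
    else:
        extra = "imm16"        # followed by a 2-byte value
    return ALU_OPS[p], extra
```

Written in octal, `decode_alu(0o075)` gives `("cmp", "imm16")`; in hex the same opcode is 3D, where the structure is much harder to see.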

It's the same "p" that occurs in the opcodes 20c xpm D, for c = (0, 1, 2, 3). So these have no register #r, as the r in xrm is replaced by the operator p. (Instead, they take on extra numeric bytes: 2 bytes for c = 1 and one byte for c = 0, 2 or 3.) These are the "p" operations between the register or memory denoted by (x,m,D) and the extra 1 or 2 bytes of numeric data appended to the operation.

When I brought up the issue of the octal legacy of the 8086 on USENET, that prompted the creation of NASM (Netwide Assembler) in direct response, which is why it used - and still uses - octal internally for the opcodes.

Upvotes: 1

fuz

Reputation: 93127

The 8086 has an opcode space collectively designated ESC (escape to coprocessor). It occupies the range d8 to df. Each instruction in this instruction space is followed by a modr/m byte and depending on the mod-field, zero to two displacement bytes. When the 8086 encounters an ESC instruction with two register operands (i.e. mod = 11), it performs a nop. When the processor encounters an ESC instruction with a memory operand, a read cycle is performed from the address indicated by the memory operand and the result is discarded.

Using two special signal lines, a coprocessor can distinguish data fetches from instruction fetches, allowing it to decode the instruction stream in parallel with the 8086.

This mechanism is used by the 8087 to hook into the instruction stream: The three available bits in the opcode byte together with the three bits of the reg field in the modr/m byte form a six bit opcode. The r/m field of the modr/m byte is used to designate either a location on the FPU register stack (if mod = 11, indicating two register operands) or a memory operand. Some opcodes encode a variety of instructions depending on the content of r/m field. In all these cases, one instruction is encoded for memory operands while eight other instructions are encoded for each possible register operand.
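To tie this back to the `ESC opcode, source` syntax from the question, here is a minimal Python sketch of how an assembler might encode the register form. It assumes the three bits in the opcode byte are the high half of the six-bit opcode and the reg field is the low half, which is how I read Intel's ESC layout. The function name `encode_esc_reg` is hypothetical, and accepting both 8-bit and 16-bit register names is only an illustration, not a statement about what any historical assembler allowed.

```python
REG16 = ("ax", "cx", "dx", "bx", "sp", "bp", "si", "di")
REG8 = ("al", "cl", "dl", "bl", "ah", "ch", "dh", "bh")

def encode_esc_reg(opcode: int, reg: str) -> bytes:
    """Encode `esc opcode, reg` with mod = 11 (hypothetical helper).

    Assumption: opcode bits 5..3 go into the d8..df byte,
    bits 2..0 go into the reg field of the modr/m byte.
    """
    if not 0 <= opcode <= 63:
        raise ValueError("ESC opcode must be in the range 0..63")
    name = reg.lower()
    if name in REG16:
        rm = REG16.index(name)
    elif name in REG8:
        rm = REG8.index(name)            # same 3-bit number space as reg16
    else:
        raise ValueError(f"unknown register: {reg}")
    first = 0xD8 | (opcode >> 3)         # 11011xxx
    modrm = 0xC0 | ((opcode & 7) << 3) | rm   # mod = 11, reg = yyy, r/m
    return bytes([first, modrm])
```

Under these assumptions `encode_esc_reg(1, "ch")` and `encode_esc_reg(1, "bp")` produce the same two bytes, d8 cd, since the r/m field only carries a register number and no operand size.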

The 8087 notices when the 8086 performs the dummy fetch immediately after fetching the instruction, and remembers the address. In case of an instruction that loads from memory, it loads the additional words of the memory operand and performs its function. In case of a store, it ignores the result of the fetch and stores its values to the address indicated by the 8086.

The 8087 operates asynchronously to the 8086. However, no implicit synchronization exists. If an attempt is made to issue an ESC instruction while the coprocessor is busy, the instruction is silently ignored. To solve this problem, the 8087 asserts its BUSY pin (connected to the 8086's TEST pin) while performing an operation. The programmer can issue a wait instruction (9b, wait for coprocessor ready) to wait until the 8087 has concluded the operation and consequently released the BUSY line. This is typically done before every single 8087 instruction, and many assemblers of the day would automatically insert WAIT prefixes. For high performance code, it was also common to instead manually calculate how long the 8087 was going to take for a certain instruction and to omit the wait when the previous floating point instruction was guaranteed to have finished.
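The automatic WAIT insertion could look roughly like this in an assembler's output stage. This is only a sketch in Python; the name `emit_fpu` and the `auto_wait` flag are made up for illustration and do not describe any particular assembler.

```python
WAIT = 0x9B  # wait for coprocessor ready

def emit_fpu(instr: bytes, auto_wait: bool = True) -> bytes:
    """Return the bytes for one ESC instruction, optionally preceded by WAIT.

    Hypothetical helper: pass auto_wait=False where the programmer has
    worked out that the previous 8087 operation has already finished.
    """
    if not instr or not 0xD8 <= instr[0] <= 0xDF:
        raise ValueError("expected an ESC instruction (first byte d8..df)")
    return (bytes([WAIT]) if auto_wait else b"") + instr
```

So `emit_fpu(bytes([0xD9, 0xE8]))` yields 9b d9 e8, while passing `auto_wait=False` leaves the instruction bytes untouched.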

Upvotes: 15
