Ahtisham
Reputation: 10126

Is it possible to manipulate the instruction pointer in 8086 assembly?

I want to know if I can manipulate (read and change the value of) the instruction pointer (IP) in 8086 assembly.

For example,

Say IP is currently storing 0200h. I would like to read this value and change it to something else, say 4020h. How could I do that?

Upvotes: 2

Views: 5136

Answers (2)

Peter Cordes
Reputation: 365781

Related: Reading program counter directly (I updated the accepted answer there to not suck, and to cover 32-bit vs. 64-bit, because it's the canonical Q&A for reading IP. It doesn't mention writing IP, because that's more of a conceptual-understanding thing: writing IP is just a jump, whereas reading IP matters when your code is running without knowing where it was loaded, so the use-cases are totally different.)

Also a near duplicate: Why can't you set the instruction pointer directly? asks why RIP/EIP/IP isn't exposed directly for use with instructions that work on integer registers like AX (i.e. why add IP, AX doesn't work as an indirect jump). TL;DR: some ISAs like ARM do expose the program counter as one of the integer registers, but x86 has few registers, and using one register encoding for IP in the machine code would take away a general-purpose integer register.


You can write IP directly with jmp or call, but you can only read it by having call push it.

(Technically call isn't the only option for reading IP. You could use int or some other interrupt and have the interrupt-handler look at the context before iret, but that's the same idea as call just much more complicated and slower.)
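As a minimal sketch of the call/pop idiom (NASM syntax, 16-bit; the label name next is just for illustration):

```nasm
bits 16

        call next       ; pushes the return address, i.e. the offset of `next`
next:
        pop  ax         ; AX = offset of `next` within CS (the IP value at this point)
```

The call here has a displacement of 0, so execution simply continues at the next instruction; its only useful effect is pushing IP.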


In position-dependent code, the address of every instruction is known at link time. You can use the address of any label as an immediate constant or part of an addressing mode. e.g.

mov ax, $         ; ax = address of the start of the MOV instruction (NASM syntax)

Or

mov  ax, label   ; or MASM:  mov ax, OFFSET label

label:

Say IP is currently storing 0200h. I would like to read this value and change it to something else, say 4020h. How could I do that?

call 4020h

The assembler will figure out what rel16 displacement to use given the current IP. (Or you could put 4020h in a register and call ax, if you want a position-independent way to jump to a fixed IP value. That value is an offset relative to cs, so it's still not an absolute address. For that you need a far call, and can use a ptr16:16 absolute direct with the address as an immediate.)
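A sketch of those two alternatives in NASM syntax. The segment value 1000h is an arbitrary example I picked, not something from the question, and the two calls are shown side by side rather than as one program:

```nasm
bits 16

        ; Near indirect call: IP = AX. No rel16 is encoded, so the
        ; instruction works wherever it is loaded, but the target is
        ; still an offset within the current CS.
        mov  ax, 4020h
        call ax

        ; Far direct call: CS:IP are both loaded from the ptr16:16
        ; immediate. 1000h is a hypothetical segment value.
        ; The callee has to return with retf, not ret.
        call 1000h:4020h
```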

The old value (+ the length of the call instruction) will be on the stack where the code at 4020h could pop it with pop (or pop back into IP with ret), or load it with a mov.
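For the "load it with a mov" option, one way the called code could look, assuming it sets up a BP frame first (the label callee is hypothetical and stands for whatever lives at 4020h):

```nasm
bits 16

callee:                     ; hypothetical routine, assumed to sit at offset 4020h
        push bp
        mov  bp, sp         ; [bp] = saved BP, [bp+2] = return address
        mov  ax, [bp+2]     ; AX = IP of the instruction after the call
        pop  bp
        ret                 ; pops the return address back into IP
```

This goes through BP because 8086 addressing modes can only use BX, BP, SI and DI as base or index registers, and BP defaults to the SS segment.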


In general, avoid mismatched call / ret (i.e. don't just pop the return address into a register and return with a jmp). That will cause branch mispredicts because you unbalance the return-address predictor stack. (http://agner.org/optimize/ and Return address prediction stack buffer vs stack-stored return address?)

On CPUs newer than PIII, call next_insn / pop ax is efficient because call rel32=0 is special-cased and doesn't break the return-address predictor stack. See Reading program counter directly.

@mksteve's suggestion to call a function that does mov bx, [sp] / ret instead of just call next_instruction / pop bx is good on early Intel P6-family CPUs like PPro. But note that [sp] isn't a valid 16-bit addressing mode, so this is extra clunky in 16-bit. Perhaps pop ax / push ax / ret would suck less if you really wanted to do it in 16-bit code.
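A sketch of that pop / push / ret helper for 16-bit code. The names get_ip, after and done are made up for the example:

```nasm
bits 16

        call get_ip     ; on return, AX = offset of `after`
after:
        ; ... use AX here ...
        jmp  done       ; skip over the helper

get_ip:
        pop  ax         ; AX = return address
        push ax         ; put it back so the ret below still matches the call
        ret

done:
```

The call and ret stay paired, which is the point of doing it this way instead of call next_instruction / pop.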


In 64-bit code, you can read the current value of RIP more directly: lea rax, [rip]. This is much more commonly used for position-independent addressing of static data. e.g. lea rax, [rel my_table] or add dword [rel global_counter], 2 will tell the assembler+linker to figure out what rel32 to use to reach the symbol you wanted. This works within an executable or within a dynamic library, where the distance between the code and the data is constant even if the library is loaded at a different address.
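A small NASM sketch of the 64-bit forms. my_table and global_counter are placeholder symbols, and I'm using [rel $] here, which gives the address of the start of the lea itself rather than the end of it:

```nasm
bits 64
default rel                         ; make [symbol] RIP-relative by default

section .data
my_table:       dq 1, 2, 3          ; hypothetical data
global_counter: dd 0                ; hypothetical counter

section .text
example:
        lea  rax, [rel $]           ; RAX = address of this LEA instruction
        lea  rbx, [rel my_table]    ; RBX = run-time address of my_table
        add  dword [rel global_counter], 2
        ret
```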

Upvotes: 4

mksteve
Reputation: 13085

If you wanted to set the instruction pointer to a known value, say hex value 4020h, you could jump directly to that address:

jmp 4020h

Or if some memory location, myVariable, held the value you wanted to store in IP you could do an indirect jump:

jmp [myVariable]

The result of a jmp (indirect or direct) modifies the instruction pointer.
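Roughly, the variants look like this in NASM syntax. They are alternatives shown side by side, not a sequence (the first jmp would never fall through), and myVariable is assumed to be a word in the data segment:

```nasm
bits 16

        jmp  4020h              ; direct: assembler encodes a relative displacement

        mov  ax, 4020h
        jmp  ax                 ; register-indirect: IP = AX

        jmp  word [myVariable]  ; memory-indirect: IP = word at DS:myVariable

myVariable: dw 4020h            ; hypothetical variable holding the target offset
```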

Reading the instruction pointer is more awkward, because no instruction copies it into a register directly. 32-bit position-independent code on Linux used to get its own address with a sequence something like:

 call getIP

with

 getIP:
 mov bx, [sp] ; Read the return address into BX. ([sp] is not encodable in 16-bit code; the 32-bit form is mov ebx, [esp].)
 ret

For other methods of reading IP, see Stack Overflow: reading IP.

Upvotes: 8
