Tryer

Reputation: 4090

Memory addressing mode interpretation for x86 on Linux

I am reading through Programming from the Ground Up by Jonathan Bartlett. The author discusses memory addressing modes and states that the general form of a memory address reference is this:

ADDRESS_OR_OFFSET (%BASE_OR_OFFSET, %INDEX, MULTIPLIER)

where the final address is calculated thus:

FINAL_ADDRESS = ADDRESS_OR_OFFSET + %BASE_OR_OFFSET + MULTIPLIER * %INDEX.

It is also stated that if any of the pieces is left out, it is just substituted with zero in the equation. ADDRESS_OR_OFFSET and MULTIPLIER are required to be constants, while the other elements are required to be registers. These seem to be the only general rules specified.

So far, so good.

The author then discusses indirect addressing mode and gives as example:

movl (%eax), %ebx

which moves the value at the address stored in the eax register into the ebx register.

For that to work, (%eax) should be interpreted as 0(%eax,0,0) as opposed to 0(0,%eax,0). Is there an additional rule that enforces this interpretation?

Upvotes: 0

Views: 524

Answers (2)

Marcio J

Reputation: 780

I'm also reading this book and noticed that the code examples are slightly different from others you may find on the internet. This is because:

The syntax for assembly language used in this book is known as the AT&T syntax. It is the one supported by the GNU tool chain that comes standard with every Linux distribution. However, the official syntax for x86 assembly language (known as the Intel® syntax) is different.

About the question, I have found more information here:

The base, index and displacement components can be used in any combination, and every component can be omitted; omitted components are excluded from the calculation above. If index register is missing, the pointless scale factor must be omitted as well.

Upvotes: 0

fuz

Reputation: 93127

The explanation in the book is not 100% correct. The x86 architecture has the following 32 bit addressing modes:

$imm                         immediate     result = imm
%reg                         register      result = reg
disp(%reg)                   indirect      result = MEM[disp + reg]
disp                         direct        result = MEM[disp]
disp(%base, %index, scale)   SIB           result = MEM[disp + base + index * scale]

In the SIB (scale/index/base) and indirect addressing modes, disp can be left out for a displacement of 0. In SIB addressing mode, base and index can additionally be left out for a base of 0 and an index of 0; scale can be left out for a scale factor of 1. Note that when I say “leave out,” only the value is left out; the comma is left in. For example, (,,1) means “SIB operand with no displacement, no base, no index, and a scale of 1.”
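
To make the defaults concrete, here is a rough model of the address arithmetic in Python. It is only a sketch: effective_address is a name I made up, the register contents are hypothetical, and it assumes 32-bit address size and the usual scale factors of 1, 2, 4, and 8.

```python
def effective_address(disp=0, base=0, index=0, scale=1):
    """Model disp(%base, %index, scale); omitted parts fall back to the defaults."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    # Address arithmetic wraps around at 32 bits in 32-bit mode.
    return (disp + base + index * scale) & 0xFFFFFFFF

# Hypothetical register contents, chosen only for illustration.
eax, ebx = 0x1000, 3

print(hex(effective_address(base=eax)))                       # (%eax)
print(hex(effective_address(disp=8, base=eax)))               # 8(%eax)
print(hex(effective_address(base=eax, index=ebx, scale=4)))   # (%eax,%ebx,4)
print(hex(effective_address(disp=0x20, index=ebx, scale=4)))  # 0x20(,%ebx,4)
```

With these values the four lines print 0x1000, 0x1008, 0x100c, and 0x2c. Note that the default for scale is 1 rather than 0; a missing index already contributes nothing to the sum.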

In 64 bit mode, a rip-relative addressing mode is additionally available:

disp(%rip)                   rip relative  result = MEM[disp + rip]

This addressing mode is useful for writing position-independent code.
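
As a rough illustration of why that helps, the sketch below models the calculation in Python. The function name and all addresses are made up for the example; the one fact it relies on is that rip holds the address of the next instruction while the current one executes.

```python
def rip_relative(disp, next_insn_addr):
    """Model disp(%rip): rip is the address of the *next* instruction."""
    return (disp + next_insn_addr) & 0xFFFFFFFFFFFFFFFF

# Hypothetical layout: the instruction ends at 0x401010 and the
# variable it accesses lives at 0x404000.
insn_end, variable = 0x401010, 0x404000
disp = variable - insn_end  # the constant that ends up in the instruction

# Load the same image at two different base addresses (made-up values).
for load_bias in (0, 0x7F0000000000):
    target = rip_relative(disp, insn_end + load_bias)
    print(hex(target), target == variable + load_bias)
```

Because code and data move together, the encoded displacement stays the same no matter where the image is loaded, so nothing in the instruction needs to be patched.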

16 bit modes have different addressing modes, but they don't really matter so I'm not going to elaborate on them.

So for your example: this is easy to understand because (%eax) is actually an indirect addressing mode with eax as the register and no displacement, not a SIB addressing mode.
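
Either way, the slots inside the parentheses are assigned by position, so a lone register always lands in the base slot. The toy parser below sketches that rule in Python. It is not how the GNU assembler is implemented, parse_memory_operand is a hypothetical helper, and it only handles memory operands.

```python
import re

def parse_memory_operand(text):
    """Split an AT&T memory operand into (disp, base, index, scale) by position."""
    m = re.fullmatch(r"\s*([^()\s]*)\s*(?:\((.*)\))?\s*", text)
    if m is None:
        raise ValueError("not a memory operand: %r" % text)
    disp = m.group(1) or None
    if m.group(2) is None:
        # No parentheses at all: a plain displacement (direct addressing).
        return (disp, None, None, None)
    # Empty slots between commas mean "left out".
    slots = [s.strip() or None for s in m.group(2).split(",")]
    if len(slots) > 3:
        raise ValueError("too many components: %r" % text)
    slots += [None] * (3 - len(slots))  # pad the trailing, omitted slots
    return (disp, slots[0], slots[1], slots[2])

print(parse_memory_operand("(%eax)"))          # eax is the base
print(parse_memory_operand("(,%eax,4)"))       # eax is the index
print(parse_memory_operand("8(%ebx,%ecx,2)"))  # all four parts present
```

To make eax the index rather than the base, you have to write the leading comma yourself, as in (,%eax,4).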

Upvotes: 4
