Alex

Reputation: 876

Assembler memory offsets and segments

Take this assembly code as an example:

.data
        hello_world db 9
end

.code
main      proc
          mov  eax, 2
          lea  ebx, hello_world
main      endp
end

In an article about assembly I read that an assembler does the following:

Saves memory offsets as offsets relative to their corresponding segment

Replaces the offsets and segments by a placeholder serving as a relocatable address for the linker

Now, for the first statement what I understand is that:

lea ebx, hello_world

will be replaced by:

lea ebx, ds:[00]

Is that right?

The second statement (assuming my reading of the first one is right) I don't really understand. The assembler has already replaced the memory offset with an offset relative to its segment, so what does it do with the placeholder for offsets?

Does it just mark the offset as relocatable in the .obj file in some way, or does it do something else?

Finally, I don't understand the placeholder for segments. Is it something that happens when writing this

.data
    hello db 1
ends

.code
main     proc
         lea      eax, data ;copy address of data segment in eax
main     endp
end

, or is it something else?

I hope this is clear.

Thanks in advance

Upvotes: 0

Views: 829

Answers (1)

vitsoft

Reputation: 5775

If the definition of the memory variable hello_world is the first statement in the .data segment, then its offset is 0 at assembly time, and you can see in the listing that loading its address into ebx is assembled to

00000000: 8D1D[00000000]     lea ebx, hello_world

The brackets [] in the listing indicate that the address 00000000 is relocatable, and it is marked as such in the object file .obj. The loader will assign a different offset to the .data segment, say 0x00402000, at run time. It then has to adjust (relocate) the displacement field in the instruction body to match the new virtual address: the field is increased by the difference between the allocated .data address (0x00402000) and the .data address assumed by the assembler (0x00000000).
The CPU (and you, in a debugger) will then see the following, where the displacement bytes appear in little-endian order, so 0x00402000 is shown as 00204000:

00401000: 8D1D[00204000]     lea ebx, hello_world

The relocated doubleword is not just a placeholder; it is a regular field of the lea instruction, which is updated when the program is loaded into memory.
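
As a rough illustration of the arithmetic described above, here is a small Python sketch that patches the displacement field of the assembled lea the way a relocation step would. The names apply_relocation and RELOC_OFFSET are made up for this example, and the single hard-coded field offset stands in for a real relocation record; actual .obj and executable formats store this information in their own structures, which are not modeled here.

```python
import struct

# Bytes emitted for `lea ebx, hello_world`:
# opcode 8D, ModRM 1D (ebx, disp32), then a 4-byte displacement of 0.
code = bytearray.fromhex("8D1D00000000")

# Hypothetical relocation record: where the displacement field starts
# inside the code bytes (right after opcode and ModRM).
RELOC_OFFSET = 2


def apply_relocation(image, field_offset, assumed_base, actual_base):
    """Add (actual_base - assumed_base) to a 32-bit little-endian field."""
    (value,) = struct.unpack_from("<I", image, field_offset)
    value = (value + actual_base - assumed_base) & 0xFFFFFFFF
    struct.pack_into("<I", image, field_offset, value)
    return value


# The assembler assumed .data at 0x00000000; here it ends up at 0x00402000.
new_disp = apply_relocation(code, RELOC_OFFSET, 0x00000000, 0x00402000)

print(code.hex().upper())  # 8D1D00204000
print(hex(new_disp))       # 0x402000
```

The patched bytes match the second listing line: the opcode and ModRM byte are untouched, and only the four displacement bytes change.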

Upvotes: 3
