Reputation: 74
I'm a newbie to assembly; I've done some Java/C before, so it's like learning to crawl after flying.
I'm currently learning from 'Tutorials Point', as it seems really beginner-friendly.
I've encountered this code:
MY_TABLE TIMES 10 DW 0 ; Allocates 10 words (2 bytes) each initialized to 0
MOV EBX, [MY_TABLE] ; Effective Address of MY_TABLE in EBX
MOV [EBX], 110 ; MY_TABLE[0] = 110
ADD EBX, 2 ; EBX = EBX +2
MOV [EBX], 123 ; MY_TABLE[1] = 123
Loading the effective address of MY_TABLE into EBX kind of means a pointer to the table's first value? (I know it's not comparable to C, but I've got to understand it somehow :))
It says ADD EBX, 2 (which means increment the EBX register by 2), but then when EBX is referred to on the next line, shouldn't it refer to the [2] place? Again, I'm thinking of this like C's arrays.
Hope this question is on the spot. I want to thank everyone in advance.
Upvotes: 1
Views: 3115
Reputation: 16596
MOV EBX, [MY_TABLE] will read 4 bytes from memory at address MY_TABLE (i.e. the first two elements of the "array"). Judging by the comment, I would guess the author intended to use LEA EBX, [MY_TABLE].
LEA is short for "load effective address", and it is basically the first half of the memory-reference variant of the MOV instruction. It calculates the final address from which such a MOV would load data, but instead of accessing memory and loading the data, it loads the address itself into the destination register. The same can be achieved in NASM with MOV ebx,MY_TABLE or in MASM/TASM with MOV ebx,OFFSET MY_TABLE.
So yes, it's like a pointer to the first value in the table, but to fully appreciate the rawness of assembly you should keep in mind that it's really a pointer to the first BYTE of the table, as pointers in assembly don't have any type. In 32-bit protected mode with a flat memory model (Win32, 32-bit Linux) an address is effectively a uint32_t, and memory is byte-addressable, so every such number points somewhere into memory (either where physical RAM or device I/O is mapped, or into an empty invalid region). That limits a simple flat-memory 32-bit mode to 4 GiB of address space (x86 allows more complex memory mapping schemes in 32-bit modes, which make it possible to address more than 4 GiB, but not with a single 32-bit register address of course).
As for your second question: that pointer points at BYTES, and your table is made of WORDS, each 2 bytes in size. The first element of the table is at offset 0, but it also occupies the next byte at offset 1. The first byte of the second value (my_table[1] in C) is at offset 2. Thus add ebx,2 is used to get the address of the second value; add ebx,1 would make you point into the middle of the first value, my_table[0].
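To see the byte offsets from the C side, here is a rough sketch (not the tutorial's code) that mimics the snippet with a plain byte pointer standing in for EBX. The helper names store_word and element_after_stores are made up for this illustration, and the assembly in the comments assumes the corrected LEA plus explicit word-sized stores.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store a 16-bit value at a raw byte address,
   roughly what "mov word [ebx], value" does. */
static void store_word(uint8_t *addr, uint16_t value) {
    memcpy(addr, &value, sizeof value);
}

/* Hypothetical helper: replays the tutorial snippet with a byte pointer
   in place of EBX, then returns my_table[index] as C sees it. */
static uint16_t element_after_stores(unsigned step, unsigned index) {
    uint16_t my_table[10] = {0};          /* MY_TABLE TIMES 10 DW 0   */
    uint8_t *ebx = (uint8_t *)my_table;   /* LEA EBX, [MY_TABLE]      */

    /* Keep the second store and the read inside the table. */
    assert(step + sizeof(uint16_t) <= sizeof my_table);
    assert(index < 10);

    store_word(ebx, 110);                 /* MOV word [EBX], 110      */
    ebx += step;                          /* ADD EBX, step (in bytes) */
    store_word(ebx, 123);                 /* MOV word [EBX], 123      */
    return my_table[index];
}
```

With step = 2 the second store lands exactly on my_table[1]. With step = 1 it would straddle the first two elements, and what you then read back depends on the machine's byte order.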
Try not to think of these things in terms of their C counterparts. Arrays are fairly low-level themselves, so the analogy often holds, but higher-level C constructs usually just add confusion. C arrays already hide the pointer arithmetic from you, using the element type to calculate the correct address for you. In assembly that doesn't happen.
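If you want to watch C doing that hidden scaling, a small sketch like the following may help. The function name byte_offset_of_element is hypothetical; it only measures how many bytes lie between the start of a 16-bit table and one of its elements.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: distance in bytes from the start of a table of
   16-bit words to element `index`. C multiplies the index by the
   element size behind your back; in assembly you do it yourself. */
static size_t byte_offset_of_element(size_t index) {
    static uint16_t my_table[10];
    const uint8_t *base = (const uint8_t *)my_table;
    const uint8_t *elem = (const uint8_t *)&my_table[index];
    return (size_t)(elem - base);
}
```

For index 1 this gives 2, which is the same 2 that the tutorial adds to EBX by hand.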
Other ways 32-bit x86 can address the C-like value my_table[1]:
lea ebx,[my_table]
mov esi,1
mov ax,[ebx + esi*2] ; by index register scaled by 2
lea ebx,[my_table]
mov esi,2
mov ax,[ebx + esi] ; by offset
lea ebx,[my_table]
mov ax,[ebx + 2] ; by immediate offset
mov ax,[my_table + 2] ; by absolute address
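All four variants end up at the same address, because the CPU computes base + index*scale + displacement for a memory operand. A tiny model of that calculation, as a sketch only: the function name effective_address is made up, and 0x1000 in the usage below is just an assumed address for my_table.

```c
#include <stdint.h>

/* Hypothetical model of 32-bit x86 address calculation:
   [base + index*scale + disp]. On real hardware scale is limited
   to 1, 2, 4 or 8; this sketch does not enforce that. */
static uint32_t effective_address(uint32_t base, uint32_t index,
                                  uint32_t scale, uint32_t disp) {
    return base + index * scale + disp;
}
```

Assuming my_table sits at 0x1000: effective_address(0x1000, 1, 2, 0) models [ebx + esi*2], effective_address(0x1000, 2, 1, 0) models [ebx + esi], effective_address(0x1000, 0, 1, 2) models [ebx + 2], and effective_address(0, 0, 1, 0x1000 + 2) models [my_table + 2]. Each one yields 0x1002.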
Edit: I have now finally noticed what you actually do with that ebx. I consider MOV [EBX], 110 a "sloppy" programming style, because neither operand of such a mov makes the data size clear.
If you do mov [ebx],ax, the size of ax defines the data width of the operation as 16 bits, but with 110 the store could be 8, 16 or 32 bits wide (or even 64 bits in 64-bit mode). So the proper/nice style in such a situation (memory reference vs. immediate) is to specify the operand size explicitly, like:
mov [ebx], word 110 ; C-like: ((short *)my_table)[some_index] = (short)110;
And for performance reasons on modern x86 it is better to avoid using the 16-bit parts of registers, so when you are dealing with an array of shorts or bytes, it may be better to load them by extending them into a full 32-bit value:
movzx eax,word [ebx] ; to extend word value [ebx] with zero bits
movsx ecx,byte [esi] ; to sign-extend the byte value at [esi]
; the topmost (sign) bit of the original value is copied up to fill the upper 24 bits of ecx
Then you can do all your calculations with the 32-bit variants of the registers (eax and ecx in the example above), and use the partial ax and cl only at the end of the calculation to store the proper result (of course, make sure the calculation itself doesn't produce wrong results because of the 32-bit register/value usage).
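If the difference between the two extensions is not obvious, here is a rough C sketch of what movzx and movsx leave in the 32-bit register. The function names are hypothetical, and the sign extension is spelled out with bit operations on purpose instead of relying on C's signed conversions.

```c
#include <stdint.h>

/* Roughly "movzx eax, word [ebx]": the upper 16 bits become zero. */
static uint32_t zero_extend_word(uint16_t w) {
    return (uint32_t)w;
}

/* Roughly "movsx ecx, byte [esi]": bit 7 (the sign bit) is copied
   into the upper 24 bits. */
static uint32_t sign_extend_byte(uint8_t b) {
    return (b & 0x80u) ? (0xFFFFFF00u | b) : (uint32_t)b;
}
```

So the byte 0x80 (which is -128 as a signed byte) becomes 0xFFFFFF80, while 0x7F stays 0x0000007F, and the word 0xFFFF zero-extends to 0x0000FFFF.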
Upvotes: 3