Reputation: 61
I have an assembly program that stores the length of a string in the ax register, but I am a little confused about some of the instructions.
include \masm32\include64\masm64rt.inc
.data
string db "mama", 0 ; so here I declared a string "mama". What happens in memory?
.code
main proc
xor ax, ax ; here I initialize ax with 0.
lea rsi, string ; here, I move the address of string into the rsi register, right? But how is the string stored in memory? Does rsi now hold the address of the first char "m" of the "mama" string?
@@: ; this sign creates an anonymous label
cmp byte ptr [rsi], 0 ; so this says compare 0 with the 1 byte found at the address pointed to by rsi, right? But I still don't get it. Why 1 byte? Is rsi pointing to the first char or to the whole string?
jz @F ; jump if zero to the nearest @@ (forward)
inc ax ; so now I'm pointing to the first character, so ax=1
inc rsi ; what happens here? Is the pointer incremented to point to the second char of the string?
jmp @B ; jump to the nearest @@ (backward)
@@:
invoke ExitProcess, 0 ; invoke ExitProcess API
ret
main endp
end
My confusion is that I'm not sure whether my understanding of how this program works is right. Am I thinking about this correctly?
Upvotes: 0
Views: 474
Reputation: 1482
string db "mama", 0
Five bytes, 0x6d 0x61 0x6d 0x61 ('mama') followed by the terminating 0x00, are stored somewhere in the data segment of the program's memory. string is a label for the address of the first of those bytes, i.e. 'm'.
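As a rough illustration of that layout (a sketch in C, not part of the original program), the bytes emitted by the db line can be written out as a byte array. The names string_bytes, string_storage_size and byte_at are made up for this example.

```c
#include <stddef.h>

/* The same bytes that `string db "mama", 0` places in the data section.
   The array name is hypothetical; it plays the role of the `string` label. */
static const unsigned char string_bytes[] = { 'm', 'a', 'm', 'a', 0 };

/* Number of bytes the definition occupies, terminator included. */
size_t string_storage_size(void) {
    return sizeof string_bytes;
}

/* The byte found `offset` bytes past the label, i.e. roughly what
   `byte ptr [rsi + offset]` would read when rsi holds the label's address. */
unsigned char byte_at(size_t offset) {
    return string_bytes[offset];
}
```

Here string_storage_size() counts the terminating zero as well as the four letters, and byte_at(0) is the 'm' (0x6d) that the label refers to.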
xor ax, ax
lea rsi, string
I believe the operation should be lea rsi, [string].
(EDIT: As Peter Cordes mentioned in the comment below, lea rsi, string is fine in the MASM assembler.)
string is the address of the first char. After the lea, rsi holds that same address.
@@: ; this sign creates an anonymous label
cmp byte ptr [rsi], 0
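rsi points to the beginning of the string, i.e. to its first byte. byte ptr tells the assembler to compare a single byte, which is the size of one character here, so the compare operation checks the one byte at rsi against zero. If that byte is zero, the code treats it as the end of the string and jumps to the exit: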
jz @F ; jump if zero to the nearest @@ (forward)
If the byte at rsi isn't zero:
inc ax
Remember, we are storing the length of the string in ax, so for each non-zero character we increment ax by 1.
inc rsi
jmp @B ; jump to the nearest @@ (backward)
Point rsi to the next character ('a') and jump back to the first @@. The code after the first @@ again checks whether the current char ('a') is zero and, since it is not, increments the count (ax) by 1, so ax becomes 2. This continues until the loop reaches a 0 byte, which the program treats as the end of the string.
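To follow the counter pass by pass, here is a minimal C sketch of the same loop that can be stopped after a chosen number of iterations. The function name count_after and its max_iterations parameter are invented for this illustration; the comments show which assembly instruction each line stands in for.

```c
#include <stdint.h>

/* Runs at most max_iterations passes of the loop and returns the counter,
   which plays the role of ax. Hypothetical helper, for illustration only. */
uint16_t count_after(const char *p, unsigned max_iterations) {
    uint16_t count = 0;                      /* xor ax, ax            */
    for (unsigned i = 0; i < max_iterations; i++) {
        if (*p == 0) {                       /* cmp byte ptr [rsi], 0 */
            break;                           /* jz @F                 */
        }
        count++;                             /* inc ax                */
        p++;                                 /* inc rsi               */
    }
    return count;
}
```

With the string "mama", count_after gives 1 after one pass, 2 after two, and stays at 4 once the zero byte has been reached, however many more passes are allowed.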
@@:
invoke ExitProcess, 0 ; invoke ExitProcess API
ret
main endp
end
Exit the program. ExitProcess does not return, so the ret after it is never reached.
Sidenote: You can use a debugger such as gdb, with a breakpoint at the beginning, to step through the program one instruction at a time. With the info registers command you can check the register values. Search online for more advanced commands and methods.
Upvotes: 1
Reputation: 26646
inc ax increments the 16-bit value by 1. The processor doesn't know what ax is used for, but it does know this is a 16-bit addition.
inc rsi increments the 64-bit value by 1. The processor doesn't "know" whether this is an integer or a pointer, but it does know this is a 64-bit addition.
This program (though it is just a main and not a function) is similar to a strlen function:
short strlen ( char *p ) {
    short count = 0;
    while ( *p != '\0' ) {
        count++; // increment 16-bit counter
        p++;     // increment 64-bit pointer by adding 1 to it
    }
    return count;
}
The processor treats pointers like integers: integers that, when dereferenced, refer to memory locations. Incrementing a pointer by 1 in assembly language makes it point to the next byte of memory, because memory locations have integer-numbered addresses, so two consecutive memory addresses differ by exactly 1.
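The "addresses are just integers" point can be checked from C by converting a pointer to an integer type. This is only a sketch: the function name address_step is made up, and the result assumes a flat address space such as x86-64, where converting a pointer to uintptr_t yields its numeric address.

```c
#include <stdint.h>

/* Numeric distance between the address of the next char and the current one.
   Assumes a flat address space (as on x86-64); hypothetical helper name. */
uintptr_t address_step(const char *p) {
    uintptr_t here = (uintptr_t)p;        /* address as a plain integer */
    uintptr_t next = (uintptr_t)(p + 1);  /* what `inc rsi` would produce */
    return next - here;
}
```

On such a platform address_step("mama") is 1, matching what inc rsi does to the register.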
Note that because a 16-bit data type (short) is used for the length counter, this program will have an overflow error if the string's actual length is >= 32768 (if short is understood as signed) or >= 65536 (if unsigned short were used instead of short).
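The unsigned case can be reproduced with a small sketch that uses uint16_t for the counter, mirroring the 16-bit ax register. The name strlen16 is invented for this example; it is not a standard function.

```c
#include <stdint.h>

/* strlen-like loop with a 16-bit unsigned counter, like ax in the assembly.
   The counter wraps around modulo 65536 instead of growing further. */
uint16_t strlen16(const char *p) {
    uint16_t count = 0;
    while (*p != '\0') {
        count++;   /* 65535 + 1 wraps back to 0 in a 16-bit unsigned counter */
        p++;
    }
    return count;
}
```

For a string of 65537 non-zero characters this returns 1 rather than 65537, which is the wrap-around described above; a wider counter (for example rax, or size_t in C) avoids the problem for any string that fits in memory.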
Upvotes: 1