Reputation: 2399
mystery has this function signature:
int mystery(char *, int);
This is the mystery function assembly code:
mystery:
movl $0, %eax # set eax to 0
leaq (%rdi, %rsi), %rcx # rcx = rdi + rsi
loop:
cmpq %rdi, %rcx
jle endl
decq %rcx
cmpb $0x65, (%rcx)
jne loop
incl %eax
jmp loop
endl:
ret
What does the line cmpq %rdi, %rcx compare: the addresses or the character values? If it compares the addresses stored in the registers, what is the point? So what if one address is greater than the other?
Upvotes: 0
Views: 4007
Reputation: 364438
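Looks like a memrchr-style backward scan (except that it counts every matching byte rather than stopping at the first one), with the cmpq checking for the search position getting back to the start of the buffer, and the cmpb checking for a matching byte.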
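For reference, here is a rough C rendering of the assembly, line by line. It is only a sketch: the name mystery_c is made up, and the length is taken as ptrdiff_t (an assumption) because the assembly adds all 64 bits of rsi to the pointer rather than a 32-bit int.

```c
#include <stddef.h>

/* Hypothetical C rendering of the assembly. The name mystery_c and the
   ptrdiff_t length are assumptions, not part of the original code. */
int mystery_c(const char *start, ptrdiff_t len)
{
    int count = 0;                  /* movl $0, %eax              */
    const char *p = start + len;    /* leaq (%rdi, %rsi), %rcx    */

    while (p > start) {             /* cmpq %rdi, %rcx / jle endl */
        --p;                        /* decq %rcx                  */
        if (*p == 0x65)             /* cmpb $0x65, (%rcx), 'e'    */
            ++count;                /* incl %eax                  */
    }
    return count;                   /* ret                        */
}
```

So the pointer compare is only the loop bound: it stops the scan once p has walked back to the first byte. The return value is the number of bytes equal to 0x65 ('e') in the buffer.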
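cmp just sets FLAGS according to dst - src, exactly like sub but without writing the result back to dst. (In AT&T syntax the destination is the last operand, so here that's rcx - rdi.) So it compares its input operands, of course. In this case they're both qword registers holding pointers.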
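To make "sets FLAGS according to dst - src" concrete, here is a small C model of a 64-bit cmp and of the jle condition. The names struct flags, cmp64 and jle_taken are invented for this sketch, and it models only the four flags that matter for conditional jumps.

```c
#include <stdbool.h>
#include <stdint.h>

/* The four flags that conditional jumps after cmp usually read. */
struct flags { bool zf, sf, cf, of; };

/* Model of a 64-bit cmp: compute dst - src and keep only the flags. */
struct flags cmp64(uint64_t dst, uint64_t src)
{
    uint64_t diff = dst - src;      /* wraps modulo 2^64, like the ALU */
    struct flags f;
    f.zf = (diff == 0);             /* operands are equal */
    f.sf = (diff >> 63) != 0;       /* sign bit of the result */
    f.cf = dst < src;               /* unsigned borrow */
    f.of = (((dst ^ src) & (dst ^ diff)) >> 63) != 0;  /* signed overflow */
    return f;
}

/* jle is taken when ZF is set or SF differs from OF (signed dst <= src). */
bool jle_taken(struct flags f)
{
    return f.zf || (f.sf != f.of);
}
```

With the question's operands, jle_taken(cmp64(rcx, rdi)) is true exactly when rcx <= rdi as signed 64-bit values, which is the loop's exit condition.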
I wouldn't recommend jle for address comparison; better to treat addresses as unsigned. Although for x86-64 it doesn't actually matter; you can't have an array that spans the signed-overflow boundary because the non-canonical "hole" is there. See: Should pointer comparisons be signed or unsigned in 64-bit x86?
Still, jbe would make more sense, unless you actually have arrays that span the boundary from the highest address to the lowest address, so the pointer wraps from 0xfff...fff to 0. Either way, you could avoid the problem by doing if (p == start) break instead of p <= start.
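A quick way to see where the signed and unsigned views of an address differ is to compare the same two bit patterns both ways. This is a sketch with made-up helper names. The cast to int64_t of a value above INT64_MAX is implementation-defined in older C standards, and this assumes the usual two's complement wrap that GCC and Clang give on x86-64.

```c
#include <stdbool.h>
#include <stdint.h>

/* Signed view, like jle: addresses with the top bit set look negative. */
bool le_signed(uint64_t a, uint64_t b)
{
    return (int64_t)a <= (int64_t)b;
}

/* Unsigned view, like jbe. */
bool le_unsigned(uint64_t a, uint64_t b)
{
    return a <= b;
}
```

For two addresses in the same half of the address space (both below 0x8000000000000000, or both at or above it) the two functions agree. They disagree only for a pair that straddles that boundary, for example 0x7fffffffffffffff and 0x8000000000000000, and no x86-64 array can straddle it because those addresses are in the non-canonical range.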
There is a bug in this function though, assuming it's written for the x86-64 System V ABI: its signature takes an int size arg, but it assumes it's sign-extended to pointer width when it does char *endp = start + len. The ABI allows narrow args to have garbage in the high bits of their register. See: Is a sign or zero extension required when adding a 32bit offset to a pointer for the x86-64 ABI?
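The effect of that garbage can be sketched in C by treating the registers as plain 64-bit integers. The function names are invented for illustration, and the narrowing cast assumes the usual two's complement behaviour of GCC and Clang.

```c
#include <stdint.h>

/* What the asm does: add the whole 64-bit register to the pointer. */
uint64_t end_using_full_rsi(uint64_t rdi, uint64_t rsi)
{
    return rdi + rsi;
}

/* What is needed for an int arg: sign-extend the low 32 bits first
   (the effect of movslq %esi, %rsi), then add. */
uint64_t end_using_esi(uint64_t rdi, uint64_t rsi)
{
    int32_t len = (int32_t)(uint32_t)rsi;   /* keep only the low 32 bits */
    return rdi + (uint64_t)(int64_t)len;
}
```

With rdi = 0x1000 and a clean rsi = 5, both return 0x1005. If the caller left rsi = 0xdeadbeef00000005, the first returns 0xdeadbeef00001005, while the second still returns 0x1005.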
There are also major performance problems with this: checking 1 byte at a time is total garbage vs. SSE2 16 bytes at a time. Also, it doesn't use either conditional branch as the loop branch, so it has 3 jumps per iteration instead of 2. i.e. an extra not-taken conditional branch.
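As a rough idea of what "16 bytes at a time" could look like, here is a sketch using SSE2 intrinsics from emmintrin.h. The function name is hypothetical, it scans forward rather than backward (the count is the same), and __builtin_popcount is a GCC/Clang builtin, so this assumes one of those compilers on x86-64. It is untuned and only meant to show the shape of the technique.

```c
#include <emmintrin.h>   /* SSE2 intrinsics */
#include <stddef.h>

/* Hypothetical helper: count bytes equal to c in buf[0..len). */
size_t count_byte_sse2(const char *buf, size_t len, char c)
{
    const __m128i needle = _mm_set1_epi8(c);    /* c in all 16 lanes */
    size_t count = 0, i = 0;

    for (; i + 16 <= len; i += 16) {
        __m128i chunk = _mm_loadu_si128((const __m128i *)(buf + i));
        __m128i eq = _mm_cmpeq_epi8(chunk, needle);   /* 0xFF where equal */
        int mask = _mm_movemask_epi8(eq);             /* one bit per byte */
        count += (size_t)__builtin_popcount((unsigned)mask);
    }
    for (; i < len; i++)                              /* scalar tail */
        count += (buf[i] == c);
    return count;
}
```

Each vector iteration handles 16 bytes with one compare and one mask extraction, and the scalar loop picks up the last len % 16 bytes.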
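Also, if eax were a plain iteration counter, it could be replaced by a pointer subtraction after the loop instead of wasting an inc %eax inside the loop. If you're going to do inc %eax on every iteration, you might as well check the size against it instead of the pointer compare. (Here the inc only runs on a match, so eax ends up as the number of matching bytes.)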
Anyway, the function is written to be easy to reverse engineer, not to be efficient. The jmp as well as 2 conditional branches makes it worse for that IMO, vs. an idiomatic loop with a condition at the bottom.
Upvotes: 2
Reputation: 2809
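It seems to be doing something like this:
const char *buff = "abcdef"; // this is rdi
int64_t len = strlen(buff); // this is rsi
const char *pRCX = buff + len; // leaq (%rdi, %rsi), %rcx
while (pRCX > buff) { // this is the cmpq / jle
pRCX--; // decq %rcx, done before the byte is read
// do something with *pRCX
}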
The cmpq in the code checks whether rcx has reached the start of the array. rcx decreases on every iteration because it starts just past the last item in the array.
Yes, cmpq %rdi, %rcx compares the addresses. It looks like an optimized way of looping through an array of characters: instead of looping over an index, it loops over the address directly. That can be faster, but it's a little hard to grasp, especially for beginners.
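To show the two loop styles side by side, here is a sketch of the same backward count written once with an index and once with a pointer. The function names are made up for this example.

```c
#include <stddef.h>

/* Index form: each access is written as buf[i - 1], i.e. base plus index. */
int count_e_by_index(const char *buf, size_t len)
{
    int count = 0;
    for (size_t i = len; i > 0; i--)
        if (buf[i - 1] == 'e')
            count++;
    return count;
}

/* Pointer form: the loop variable is the address itself, as in the asm. */
int count_e_by_pointer(const char *buf, size_t len)
{
    int count = 0;
    for (const char *p = buf + len; p > buf; )
        if (*--p == 'e')
            count++;
    return count;
}
```

Both return the same count. An optimizing compiler will often turn the index form into something close to the pointer form anyway, so the difference is mostly about how the source reads.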
Also, I think I read in Agner Fog's optimization manuals that looping through a series of data from the last item in decreasing order is faster than the increasing order that is typical when coding a loop.
Upvotes: 3