What does cmpq compare?

Question

mystery has this function signature:

int mystery(char *, int);

This is the mystery function assembly code:

mystery:
        movl    $0, %eax                # set eax to 0
        leaq    (%rdi, %rsi), %rcx      # rcx = rdi + rsi

loop:
        cmpq    %rdi, %rcx
        jle     endl
        decq    %rcx
        cmpb    $0x65, (%rcx)
        jne     loop
        incl    %eax
        jmp     loop

endl:
        ret

What does the line cmpq %rdi, %rcx compare: the addresses or the character values? If it compares the addresses stored in the registers, what is the point? Why does it matter whether one address is greater than the other?

Peter Cordes · Accepted Answer

Looks like memrchr, with the cmpq checking for the search position getting back to the start of the buffer, and the cmpb checking for a matching byte.
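To make the two comparisons concrete, here is a rough C rendering of the loop as I read the assembly. It is a sketch, not the original source: the name mystery_c is made up, and len is declared long (64 bits on x86-64 System V) only to mirror the fact that the asm adds the full rsi to the pointer. Note that as written the counter is incremented only on a match, so the return value is the number of 0x65 ('e') bytes seen while scanning backward.

```c
#include <stddef.h>

/* Hypothetical C rendering of the asm loop; names are illustrative only. */
int mystery_c(const char *start, long len)
{
    int count = 0;                  /* movl $0, %eax */
    const char *p = start + len;    /* leaq (%rdi, %rsi), %rcx */

    while (p > start) {             /* cmpq %rdi, %rcx ; jle endl  (address compare) */
        --p;                        /* decq %rcx */
        if (*p == 0x65)             /* cmpb $0x65, (%rcx)  (byte compare) */
            ++count;                /* incl %eax */
    }
    return count;
}
```

So cmpq only decides when to stop walking the buffer, while cmpb is the one that looks at the character.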

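cmp just sets FLAGS according to dst - src, exactly like sub, except that it discards the result instead of writing it to dst. (In AT&T syntax the destination is the last operand, so cmpq %rdi, %rcx sets FLAGS from rcx - rdi.) So it compares its input operands, of course. In this case they're both qword registers holding pointers.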
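As an illustration of "sets FLAGS according to dst - src", here is a small C model of the four flags involved and of the conditions that jle and jbe test. The struct and function names are hypothetical helpers for this sketch; the flag definitions and the jle / jbe conditions follow the usual x86 rules.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type: the flags that jle and jbe look at. */
struct flags { bool zf, cf, sf, of; };

/* Model of AT&T `cmpq src, dst`: compute dst - src and keep only the flags. */
struct flags cmpq_model(uint64_t src, uint64_t dst)
{
    uint64_t diff = dst - src;      /* wraps modulo 2^64, like the hardware */
    struct flags f;
    f.zf = (diff == 0);             /* operands were equal */
    f.cf = (dst < src);             /* unsigned borrow */
    f.sf = (diff >> 63) != 0;       /* sign bit of the result */
    /* Signed overflow: operands had different signs and the result's
       sign differs from dst's sign. */
    f.of = (((dst ^ src) & (dst ^ diff)) >> 63) != 0;
    return f;
}

/* jle: taken if dst <= src as signed integers. */
bool jle_taken(struct flags f) { return f.zf || (f.sf != f.of); }

/* jbe: taken if dst <= src as unsigned integers. */
bool jbe_taken(struct flags f) { return f.cf || f.zf; }
```

With src = 0x1000 and dst = 0x1008 neither branch is taken. The two conditions only disagree when the operands have different sign bits, for example src = 0xffff800000000000 and dst = 0x1000, where jbe is taken but jle is not.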

I wouldn't recommend jle for address comparison; better to treat addresses as unsigned. Although for x86-64 it doesn't actually matter; you can't have an array that spans the signed-overflow boundary because the non-canonical "hole" is there. See: Should pointer comparisons be signed or unsigned in 64-bit x86?

Still, jbe would make more sense, unless you actually have arrays that span the boundary from the highest address to the lowest address, so the pointer wraps from 0xfff...fff to 0. Either way, you could avoid the problem by doing if (p == start) break instead of checking p <= start.

There is a bug in this function though, assuming it's written for the x86-64 System V ABI: its signature takes an int size arg, but it assumes it's sign-extended to pointer width when it does char *endp = start + len.

The ABI allows narrow args to have garbage in the high bits of their register. See: Is a sign or zero extension required when adding a 32bit offset to a pointer for the x86-64 ABI?
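To show what goes wrong, here is a minimal C model of a 64-bit register whose upper half holds garbage, and of the sign extension the function would need before the pointer add (in asm that would be something like movslq %esi, %rsi ahead of the leaq). The function name is a hypothetical helper, and the narrowing cast relies on the two's-complement behaviour that mainstream compilers provide.

```c
#include <stdint.h>

/* Model of `movslq %esi, %rsi`: keep the low 32 bits of the register
   and sign-extend them to 64 bits, ignoring whatever was in the top half. */
int64_t sign_extend_low32(uint64_t reg)
{
    /* uint32_t -> int32_t is implementation-defined for large values
       before C23; GCC and Clang wrap as two's complement. */
    return (int64_t)(int32_t)(uint32_t)reg;
}
```

For example, if the caller left rsi = 0xdeadbeef00000005 while passing len = 5, using the raw register as the offset lands far outside the buffer, whereas sign_extend_low32 recovers 5.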

There are also major performance problems with this: checking 1 byte at a time is total garbage vs. SSE2 16 bytes at a time. Also, it doesn't use either conditional branch as the loop branch, so it has 3 jumps per iteration instead of 2. i.e. an extra not-taken conditional branch.
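For a sense of what "16 bytes at a time" can look like, here is a hedged sketch in C with SSE2 intrinsics that counts matching bytes, which is what the posted loop computes as I read it. The function name is made up, __builtin_popcount assumes GCC or Clang, and this is only one simple way to do it, not a tuned implementation.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>

/* Count bytes equal to c in buf[0..len): 16 bytes per step, scalar tail. */
size_t count_byte_sse2(const char *buf, size_t len, char c)
{
    const __m128i needle = _mm_set1_epi8(c);    /* c broadcast to all 16 lanes */
    size_t count = 0, i = 0;

    for (; i + 16 <= len; i += 16) {
        /* Unaligned 16-byte load, then a per-byte equality compare. */
        __m128i chunk = _mm_loadu_si128((const __m128i *)(buf + i));
        int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(chunk, needle));
        /* One mask bit per matching byte. */
        count += (size_t)__builtin_popcount((unsigned)mask);
    }
    for (; i < len; ++i)                        /* leftover 0..15 bytes */
        count += (buf[i] == c);
    return count;
}
```

Scan direction does not matter for a count, so this walks forward; a search for the last match would instead start from the end and stop at the first non-zero mask.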

Also, it could pointer-subtract after the loop instead of wasting an inc %eax inside the loop. If you're going to do inc %eax inside the loop, you might as well check the size against it instead of the pointer compare.

Anyway, the function is written to be easy to reverse engineer, not to be efficient. The jmp as well as 2 conditional branches makes it worse for that IMO, vs. an idiomatic loop with a condition at the bottom.
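As a sketch of the "condition at the bottom" shape, here is the same scalar count written in C as a do-while with an up-front empty-buffer check, using p != start as the exit test. The function name is hypothetical, and how a compiler actually lays out the branches depends on the compiler and options.

```c
#include <stddef.h>

/* Hypothetical rewrite: one conditional branch per iteration, at the bottom. */
int count_e_rotated(const char *start, size_t len)
{
    int count = 0;
    const char *p = start + len;

    if (p == start)                 /* empty buffer: skip the loop entirely */
        return 0;
    do {
        --p;
        count += (*p == 0x65);      /* 0 or 1; needs no branch of its own */
    } while (p != start);           /* loop branch, equality test only */
    return count;
}
```

Because the exit test is an equality check, the signed-versus-unsigned question for the pointer comparison never comes up.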
