C++ variable reset to 0 after calling x64 assembly function

Question

I'm trying to call an x64 assembly function from C++ code with four parameters, and the variable I pass as the first argument is reset to zero after every call. The code is below.

C++ code: test.cpp

#include <iostream>

extern "C" int IntegerShift_(unsigned int a, unsigned int* a_shl, unsigned int* a_shr, unsigned int count);

int main(int argc, char const *argv[])
{
    unsigned int a = 3119, count = 6, a_shl, a_shr;
    std::cout << "a value before calling " << a << std::endl;
    IntegerShift_(a, &a_shl, &a_shr, count);
    std::cout << "a value after calling " << a << std::endl;
    return 0;
}

x64 assembly code: test.asm

section .data
section .bss
section .text

global IntegerShift_
    IntegerShift_:
        ;prologue
        push rbp
        mov rbp, rsp

        mov rax, rdi
        shl rax, cl
        mov [rsi], rax
        mov rax, rdi
        shr rax, cl
        mov [rdx], rax
        xor rax,rax

        ;epilogue
        mov rbp, rsp
        pop rbp
        ret

I'm working in the following environment:

OS - Ubuntu 18.04 64-bit
Assembler - nasm (2.13.02)
C++ compiler - g++ (7.4.0)
processor - Intel® Pentium(R) CPU G3240 @ 3.10GHz × 2

and I'm compiling and running my code as follows:

$ nasm -f elf64 -g -F dwarf test.asm
$ g++ -g -o test test.cpp test.o
$ ./test
a value before calling 3119
a value after calling 0

But if I comment out the line mov [rdx], rax in the assembly function, it no longer resets the value of variable a. I'm new to x64 assembly programming and I couldn't find the relation between the rdx register and variable a.

Peter Cordes · Accepted Answer

unsigned int* a_shl, unsigned int* a_shr are pointers to unsigned int, a 32-bit (dword) type.

You do two qword (8-byte) stores, mov [rsi], rax and mov [rdx], rax, which write past the end of the 4-byte pointed-to objects.

The C equivalent would be a function that takes unsigned int* args and does
*(unsigned long*)a_shr = a>>count;. This is of course UB, and behaviour like this (overwriting other variables) is pretty much what you'd expect.

Presumably you compiled with optimization disabled so the caller actually reloaded a from the stack. And it put a_shr or a_shl next to a in its stack frame, and one of your stores zeroed your caller's copy of a.
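The overwrite can be reproduced without any undefined behaviour. The following is a minimal sketch, not the caller's real stack layout: the `Slots` struct and the helper names are hypothetical, and it assumes a little-endian target such as x86-64, where the low four bytes of an 8-byte store land at the lower address.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for two adjacent 4-byte stack slots in the caller:
// a_shr sits at the lower address, a directly above it.
struct Slots {
    std::uint32_t a_shr;
    std::uint32_t a;
};
static_assert(sizeof(Slots) == 8, "expects two packed 32-bit members");

// Emulates `mov [rdx], rax`: an 8-byte store starting at the address of a_shr.
// Copying over the whole struct keeps this well-defined in C++.
inline Slots qword_store(Slots s, std::uint64_t rax) {
    std::memcpy(&s, &rax, sizeof rax);
    return s;
}

// Emulates `mov [rdx], eax`: a 4-byte store that touches only a_shr.
inline Slots dword_store(Slots s, std::uint32_t eax) {
    std::memcpy(&s.a_shr, &eax, sizeof eax);
    return s;
}

// Self-check with the question's values (a = 3119, count = 6).
inline void demo_store_width() {
    const Slots before{0u, 3119u};
    assert(qword_store(before, 3119u >> 6).a == 0u);     // neighbour zeroed
    assert(dword_store(before, 3119u >> 6).a == 3119u);  // neighbour intact
}
```

Calling `demo_store_width()` from `main` exercises both cases. The upper 32 bits of the shifted value are zero, so the 8-byte store zeroes whatever follows `a_shr` in memory.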

(As usual, gcc happened to zero the upper 32 bits of RDI when it put a into EDI as the first arg: writing a 32-bit register zero-extends to the full register. So your other bug, right-shifting high garbage into the low 32 bits for a_shr, didn't bite you with this caller.)
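To illustrate the second bug, here is a rough model of the two shift widths in C++. The function names and the "dirty" register value are made up for the illustration. The calling convention does not promise that the upper half of RDI is zero for a 32-bit argument, so a different caller could legitimately leave garbage there.

```cpp
#include <cassert>
#include <cstdint>

// Models `mov rax, rdi` / `shr rax, cl`, keeping the low 32 bits of the result.
// A 64-bit shift masks the count to 6 bits on x86.
inline std::uint32_t shr_via_64bit(std::uint64_t rdi, unsigned count) {
    return static_cast<std::uint32_t>(rdi >> (count & 63));
}

// Models `mov eax, edi` / `shr eax, cl`: only the low 32 bits take part.
// A 32-bit shift masks the count to 5 bits on x86.
inline std::uint32_t shr_via_32bit(std::uint64_t rdi, unsigned count) {
    return static_cast<std::uint32_t>(rdi) >> (count & 31);
}

inline void demo_high_garbage() {
    const std::uint64_t clean = 3119;                          // upper half zero
    const std::uint64_t dirty = 0xFFFFFFFF00000000ULL | 3119;  // hypothetical garbage
    assert(shr_via_64bit(clean, 6) == 48u);           // works only by luck
    assert(shr_via_32bit(dirty, 6) == 48u);           // always correct
    assert(shr_via_64bit(dirty, 6) == 0xFC000030u);   // garbage shifted in
}
```

With a clean upper half both versions agree, which is why the bug stayed hidden with this caller.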

Simpler implementation:

global IntegerShift    ; why the trailing underscore?  That's weird for no reason.
    IntegerShift:
        ;prologue not needed, we don't even use the stack
        ; so don't waste instructions making a frame pointer.

        mov   eax, edi
        shl   eax, cl              ; a<<count
        mov   [rsi], eax           ; 32-bit store

        shr   edi, cl              ; a>>count
        mov   [rdx], edi           ; 32-bit store

        xor   eax, eax             ; return 0
        ret

xor eax, eax is the most efficient way to zero a 64-bit register (no wasted REX prefix). And your return value is only 32-bit anyway because you declared it int, so it makes no sense to be using 64-bit registers.
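For checking the assembly against known-good results, a plain C++ equivalent can be handy. This is only a sketch: `IntegerShiftRef` is a hypothetical name, it assumes `unsigned int` is 32 bits wide (as on x86-64 Linux), and the `& 31` mirrors how x86 masks a 32-bit shift count, which also avoids undefined behaviour in C++ for counts of 32 or more.

```cpp
#include <cassert>

// Hypothetical reference version of the 32-bit routine, same signature as
// the assembly function so the two can be compared call for call.
inline int IntegerShiftRef(unsigned int a, unsigned int* a_shl,
                           unsigned int* a_shr, unsigned int count) {
    *a_shl = a << (count & 31);  // 32-bit result, 4-byte store
    *a_shr = a >> (count & 31);  // 32-bit result, 4-byte store
    return 0;
}

// Self-check with the question's values (a = 3119, count = 6).
inline void demo_reference() {
    unsigned int a = 3119, a_shl = 0, a_shr = 0;
    assert(IntegerShiftRef(a, &a_shl, &a_shr, 6) == 0);
    assert(a_shl == 199616u);  // 3119 * 64
    assert(a_shr == 48u);      // 3119 / 64, truncated
    assert(a == 3119u);        // caller's variable is untouched
}
```

Running the same inputs through the assembly function and comparing the outputs against this version is one simple way to catch width mistakes like the one in the question.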

BTW, if you had BMI2 available (which you don't on your budget Pentium CPU, unfortunately), you could avoid all the register copying and be more efficient on Intel CPUs. SHLX/SHRX is only 1 uop, instead of 3 for shl/shr reg, cl, which has to keep FLAGS unmodified in the cl=0 case because of legacy x86 semantics.

    shlx   eax, edi, ecx
    shrx   edi, edi, ecx
    mov   [rsi], eax
    mov   [rdx], edi
    xor   eax, eax
    ret
