Reputation: 1622
I'm trying to call x64 assembly function from C++ code with four parameters and the assembly function reset the first parameter to zero every time. Please find the code snippet below.
C++ code: test.cpp
#include <iostream>
extern "C" int IntegerShift_(unsigned int a, unsigned int* a_shl, unsigned int* a_shr, unsigned int count);
int main(int argc, char const *argv[])
{
unsigned int a = 3119, count = 6, a_shl, a_shr;
std::cout << "a value before calling " << a << std::endl;
IntegerShift_(a, &a_shl, &a_shr, count);
std::cout << "a value after calling " << a << std::endl;
return 0;
}
x64 assembly code: test.asm
section .data
section .bss
section .text
global IntegerShift_
IntegerShift_:
;prologue
push rbp
mov rbp, rsp
mov rax, rdi
shl rax, cl
mov [rsi], rax
mov rax, rdi
shr rax, cl
mov [rdx], rax
xor rax,rax
;epilogue
mov rbp, rsp
pop rbp
ret
I'm working on the below environment.
OS - Ubuntu 18.04 64-bit
Assembler - nasm (2.13.02)
C++ compiler - g++ (7.4.0)
processor - Intel® Pentium(R) CPU G3240 @ 3.10GHz × 2
and I'm compiling my code as below
$ nasm -f elf64 -g -F dwarf test.asm
$ g++ -g -o test test.cpp test.o
$ ./test
$ a value before calling 3119
$ a value after calling 0
But if i comment out the line mov [rdx], rax
from assembly function, its not resetting the value of variable a
. I'm new to x64 assembly programming and I couldn't find the relation between rdx register and variable a.
Upvotes: 1
Views: 460
Reputation: 364180
unsigned int* a_shl, unsigned int* a_shr
are pointers to unsigned int
, a 32-bit (dword) type.
You do two qword stores, mov [rsi], rax
and mov [rdx], rax
which store outside of the pointed-to objects.
The C equivalent would be a function that takes unsigned int*
args and does
*(unsigned long)a_shr = a>>count;
. This is of course UB, and behaviour like this (overwriting other variables) is pretty much what you'd expect.
Presumably you compiled with optimization disabled so the caller actually reloaded a
from the stack. And it put a_shr
or a_shl
next to a
in its stack frame, and one of your stores zeroed your caller's copy of a
.
(As usual, gcc happened to zero the upper 32 bits of RDI while it put a
into EDI as the first arg. Writing a 32-bit register zero-extends to the full register. So your other bug; right shifting high garbage into the low 32 bits for a_shr
, didn't bite you with this caller.)
Simpler implementation:
global IntegerShift ; why the trailing underscore? That's weird for no reason.
IntegerShift:
;prologue not needed, we don't even use the stack
; so don't waste instructions making a frame pointer.
mov eax, edi
shl rax, cl ; a<<count
mov [rsi], eax ; 32-bit store
;mov rax, rdi ; we can just destroy our local a, we're done with it
shr edi, cl ; a>>count
mov [rdx], edi ; 32-bit store
xor eax, eax ; return 0
ret
xor eax, eax
is the most efficient way to zero a 64-bit register (no wasted REX prefix). And your return value is only 32-bit anyway because you declared it int
, so it makes no sense to be using 64-bit registers.
BTW, if you had BMI2 available (which you don't on your budget Pentium CPU, unfortunately), you could avoid all the register copying, and be more efficient on Intel CPUs (SHL/RX is only 1 uop instead of 3 for shl/r reg, cl
because of legacy x86 FLAGS-unmodified semantics for the cl=0 case)
shlx eax, edi, ecx
shrx edi, edi, ecx
mov [rsi], eax
mov [rdx], edi
xor eax, eax
ret
Upvotes: 4