user1520427

Calling C++ function from assembly, with struct argument

I'm getting some unexpected behaviour which probably means I don't fully understand what the compiler is doing. Consider the following contrived program:

#include <stdio.h>

#pragma pack(push, 1)
struct A {
    unsigned short a;
    unsigned short b;

    explicit A() {
        printf("construct\n");
    }
    ~A() {
        printf("destruct\n");
    }
};
#pragma pack(pop)
static_assert(sizeof(A) == 4, "sizeof(A) != 4");

A  __stdcall f(int p1, A p2, int p3, int p4) {
    printf("%08X %08X %08X %08X\n", p1, p2, p3, p4);
    return p2;
}

int main() {
    __asm {
        push 4
        push 3
        push 2
        push 1
        call f

    }
    return 0;
}

The program above crashes, but it runs fine if I remove the definitions of A() and ~A() from struct A. The problem seems to be where the compiler expects the arguments on the stack: with the constructor and destructor defined, it reads them 4 bytes further along than where I pushed them. Without the constructor and destructor, the output is:

00000001 00000002 00000003 00000004

That is what I expected. With the constructor and destructor defined, however, I get:

00000002 00000003 00000004 00000000

This is obviously not what I expected. In the first case the function returns with RETN 0x10 and in the second with RETN 0x14, so the compiler seems to think there is one more parameter (why?). I also noticed that if I change f to be a void function, it works as expected. Can someone explain what's going on and why? I have all optimizations turned off.

Upvotes: 3

Answers (1)

Vaughn Cato

At the assembly level, only simple values can be returned from a function in a register. If a more complex object needs to be returned, the compiler treats the function as if you were passing it a pointer to the storage for the returned object:

void f(A *return_ptr,int p1,A p2,int p3,int p4);

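Certain optimizations can be made, however. In your example, the class contains two 16-bit shorts, and those can be packed into a single 32-bit integer and returned in a register. Once you define a custom destructor, though, the class is no longer considered simple enough for this optimization, so the compiler falls back to the hidden pointer parameter. That extra parameter is why the arguments appear shifted by 4 bytes and why the function returns with RETN 0x14 instead of RETN 0x10.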
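To make the two return paths concrete, here is a minimal portable sketch of what each one amounts to. It is not the asker's code and not what any particular compiler emits: PlainA, FancyA, return_in_register, return_via_hidden_pointer and demo are hypothetical names invented for illustration, and the "register" is modelled as a plain 32-bit integer. The second function takes the result storage as its first argument, which matches the shifted output in the question (the callee read the pushed 1 as the hidden pointer and 2 as p1).

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <new>
#include <type_traits>

// Hypothetical stand-in for A without a constructor or destructor.
struct PlainA {
    unsigned short a;
    unsigned short b;
};

// Hypothetical stand-in for A with a user-provided destructor.
struct FancyA {
    unsigned short a;
    unsigned short b;
    FancyA(unsigned short a_, unsigned short b_) : a(a_), b(b_) {}
    ~FancyA() {}
};

static_assert(sizeof(PlainA) == 4, "PlainA must fit in 32 bits");
static_assert(std::is_trivially_destructible<PlainA>::value, "simple enough");
static_assert(!std::is_trivially_destructible<FancyA>::value, "not simple");

// Register-style return: the four bytes of PlainA travel as one 32-bit value.
std::uint32_t return_in_register(PlainA v) {
    std::uint32_t bits;
    std::memcpy(&bits, &v, sizeof bits);
    return bits;
}

// Hidden-pointer-style return: the caller supplies storage as an extra
// first argument and the callee constructs the result in that storage.
FancyA* return_via_hidden_pointer(void* return_slot, int p1, const FancyA& p2,
                                  int p3, int p4) {
    (void)p1; (void)p3; (void)p4;        // unused in this sketch
    return new (return_slot) FancyA(p2); // copy-construct into caller's slot
}

// Exercises both paths; the caller owns and destroys the returned FancyA.
void demo() {
    PlainA plain{0x1111, 0x2222};
    std::uint32_t bits = return_in_register(plain);
    PlainA back;
    std::memcpy(&back, &bits, sizeof back);
    assert(back.a == plain.a && back.b == plain.b);

    alignas(FancyA) unsigned char slot[sizeof(FancyA)];
    FancyA* result = return_via_hidden_pointer(slot, 1, FancyA(2, 0), 3, 4);
    assert(result->a == 2 && result->b == 0);
    result->~FancyA();
}
```

For the inline assembly in the question, this suggests that the call site would need one more push after push 1, carrying the address of at least 4 bytes of storage for the result. That is an untested assumption about the 32-bit MSVC __stdcall convention, based on the RETN 0x14 the asker observed, so check the disassembly of an ordinary C++ call to f before relying on it.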

Upvotes: 6
