Reputation: 763
I have some simple code that adds two size_t values together:
#include <stdlib.h>
extern "C" __declspec(dllexport) size_t _cdecl add(size_t x, size_t y)
{
return x + y;
}
(Note: this code is compiled and run on a 64-bit system.)
When calling that function via Python's ctypes and passing it arguments of type c_uint (32 bits in size instead of 64), the function works as expected:
import ctypes
lib = ctypes.cdll['./ctypetest.dll']
add = lib.add
add.restype = ctypes.c_uint
add.argtypes = [ctypes.c_uint, ctypes.c_uint]
add(1, 2) # = 3
As a sanity check, I verified that c_uint and size_t are indeed different sizes:
>>> ctypes.sizeof(ctypes.c_size_t)
8
>>> ctypes.sizeof(ctypes.c_uint)
4
How does ctypes successfully call this function given arguments of different sizes?
Upvotes: 0
Views: 416
Reputation: 365767
The answer depends on the calling conventions of the ABI of the C compiler used to compile your Python.
It sounds like you're on x86-64 Windows.[1] If so, your system is built around the Microsoft x64 ABI. And if not, that still makes for a good example, so let's pretend you are. Slightly oversimplified,[2] the calling convention for that ABI works like this:

- The first four integer or pointer arguments are passed in the 64-bit registers RCX, RDX, R8, and R9, in that order.
- Any further arguments are passed on the stack.
- The integer return value comes back in RAX.
- The caller is responsible for cleaning up the stack.
So, your c_uint arguments get stored in the low 32 bits of RCX and RDX, respectively, while the high 32 bits of each of those registers get cleared to 0.
The add function then adds RCX and RDX as unsigned 64-bit ints, and the result is exactly what you'd expect; everything works.[3]
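You can see this for yourself. Here's a sketch, assuming the ctypetest.dll from the question built for 64-bit, showing that declaring the arguments as c_uint and as c_size_t both produce the same sum for values that fit in 32 bits:

import ctypes

lib = ctypes.CDLL('./ctypetest.dll')  # the DLL from the question
add = lib.add

# "Wrong" declaration: 32-bit arguments. Each value is zero-extended
# into the low half of RCX/RDX, so the 64-bit add still works.
add.restype = ctypes.c_uint
add.argtypes = [ctypes.c_uint, ctypes.c_uint]
print(add(1, 2))  # 3

# Correct declaration: 64-bit arguments, matching the real size_t signature.
add.restype = ctypes.c_size_t
add.argtypes = [ctypes.c_size_t, ctypes.c_size_t]
print(add(1, 2))  # also 3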
But imagine you were on a different platform, with a different ABI. In fact, your imagination doesn't have to go very far; if you run a 32-bit program on the same Windows machine, you get the Microsoft IA-32 ABI instead of Microsoft x64. That ABI has three different calling conventions, and the _cdecl in your declaration now selects one of the three, which works like this:

- All arguments are pushed onto the stack, right to left.
- The integer return value comes back in EAX.
- The caller is responsible for cleaning up the stack.
OK, now c_uint and size_t both happen to be 32 bits, but let's do the same thing with c_ushort.
Your Python code pushes two 16-bit values onto the stack. add tries to use both of your values, as in x | (y << 16), as its x parameter, and then whatever happens to be next to them on the stack as its y parameter. So, what you get back is garbage.
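If you want to try reproducing this, here's a sketch, assuming a 32-bit Python and a hypothetical 32-bit build of the same DLL (the name ctypetest32.dll is made up). Whether you actually see garbage depends on how your particular ctypes/libffi build packs sub-word arguments:

import ctypes

lib = ctypes.CDLL('./ctypetest32.dll')  # hypothetical 32-bit build
add = lib.add
add.restype = ctypes.c_uint
# Wrong: size_t is 32 bits on this platform, not 16.
add.argtypes = [ctypes.c_ushort, ctypes.c_ushort]
print(add(1, 2))  # may be 3, may be garbage, depending on argument packing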
And it can get even worse. What if you'd used _stdcall? In the Microsoft x64 ABI that does nothing, but in the Windows IA-32 ABI, it specifies the same parameter-passing order as _cdecl, but stack cleanup by the callee rather than the caller.
So, after generating your garbage for you, add goes to clean up the stack, and it's expecting a different size than what you gave it, and… well, actually, I think in this specific case you get away with it because the parameter area of the stack is padded to 16-byte alignment, so cleaning up 16 bytes instead of 8 doesn't matter. But that's just dumb luck.
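As an aside, ctypes itself encodes this cdecl/stdcall distinction in which loader you use; this is a fact about the ctypes API, not about the question's DLL:

import ctypes

# On 32-bit Windows, CDLL calls functions as __cdecl (caller cleans up),
# while WinDLL calls them as __stdcall (callee cleans up). On 64-bit
# Windows the two behave identically, because the x64 ABI has a single
# calling convention.
cdecl_lib = ctypes.CDLL('./ctypetest.dll')
stdcall_lib = ctypes.WinDLL('./ctypetest.dll')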
There are also some platforms that pass values in partial registers. For example, IIRC, the Win32s version of _fastcall did something like this:

- The first two integer arguments are passed in registers: in AX and DX if they're 16-bit, or in AL and DL if they're 8-bit.
- Remaining arguments go on the stack.
AL is just the low half of AX, and loading a byte into AL does not clear the high half. So, what happens if you call a _fastcall function that wanted to add two 16-bit numbers, but you thought it wanted to add two 8-bit numbers? You get the sum of x, y, z*256, and w*256, where z and w are just whatever happened to be left around in AH and DH by some previous instruction.
There's a reason all of my weird examples came from 32-bit and smaller ABIs. Most 64-bit ABIs were designed more recently, and less haphazardly, and specifically to make POSIX/C code and/or Win64/C code run nicely, so they tend to be pretty similar. For example, the System V AMD64 ABI (used by almost everything but Windows on x86_64), the AArch64 ABI (used by almost everything on ARM64), and the PowerPC64 ABI (used by everything on PowerPC64) all have basically the same calling convention as the Microsoft x64 ABI, except for a different set of integer-parameter registers, and slightly different floats-and-stuff rules. But that doesn't mean you can rely on it being safe to get the parameters wrong; it just means you have a harder time finding test systems to detect and debug your bugs…
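The portable fix, of course, is to declare the types the function actually uses. Here's a sketch, again assuming the DLL from the question:

import ctypes

lib = ctypes.CDLL('./ctypetest.dll')
add = lib.add

# Declare the real signature: size_t add(size_t, size_t).
# ctypes.c_size_t matches the platform's size_t on both 32- and 64-bit builds.
add.restype = ctypes.c_size_t
add.argtypes = [ctypes.c_size_t, ctypes.c_size_t]

print(hex(add(0xFFFFFFFF, 0xFFFFFFFF)))  # 0x1fffffffe on a 64-bit build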
1. You didn't say, but __declspec and _cdecl usually only appear in Windows code. And you said "a 64-bit system", and I doubt you're on Itanium or some other non-x86 64-bit platform.
2. There's some extra complexity for floats, SSE vectors, structs larger than 64 bits, varargs…
3. You might be a bit surprised that 0xffffffff + 0xffffffff is 0x00000001fffffffe instead of 0xfffffffe… but since you got the restype wrong as well, you're going to truncate that to 32 bits (and you're on a little-endian system, and one that returns values in registers; if both of those were not true, you'd get 1 as the answer…), and, since these are unsigned ints, truncating and rolling over look identical, so two errors cancel out and you see the 0xfffffffe you expected.
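You can check the arithmetic in plain Python:

>>> hex(0xffffffff + 0xffffffff)  # the 64-bit sum add really computes
'0x1fffffffe'
>>> hex((0xffffffff + 0xffffffff) & 0xffffffff)  # low 32 bits, what c_uint hands back
'0xfffffffe'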
Upvotes: 1