Reputation: 763
I have some simple code that adds two size_t values together:
#include <stdlib.h>
extern "C" __declspec(dllexport) size_t _cdecl add(size_t x, size_t y)
{
return x + y;
}
(Note: this code is compiled and run on a 64-bit system.)
When calling that function via Python's ctypes and passing it arguments of type c_uint (32 bits in size instead of 64), the function works as expected:
import ctypes
lib = ctypes.cdll['./ctypetest.dll']
add = lib.add
add.restype = ctypes.c_uint
add.argtypes = [ctypes.c_uint, ctypes.c_uint]
add(1, 2) # = 3
As a sanity check, I verified that c_uint and size_t are indeed different sizes:
>>> ctypes.sizeof(ctypes.c_size_t)
8
>>> ctypes.sizeof(ctypes.c_uint)
4
How does ctypes successfully call this function given arguments of different sizes?
Upvotes: 0
Views: 416
Reputation: 365767
The answer depends on the calling conventions of the ABI of the C compiler used to compile your Python.
It sounds like you're on x86-64 Windows.[1] If so, your system is built around the Microsoft x64 ABI. And if not, that still makes for a good example, so let's pretend you are. Slightly oversimplified,[2] the calling convention for that ABI works like this:

- The first four integer or pointer arguments are passed in the 64-bit registers RCX, RDX, R8, and R9, in that order.
- Any further arguments are passed on the stack.
- The integer return value comes back in RAX.
- The caller is responsible for cleaning up the stack.
So, your c_uint arguments get stored in the low 32 bits of RCX and RDX, respectively, while the high 32 bits of each of those registers get cleared to 0.
The add function then adds RCX and RDX as unsigned 64-bit ints, and the result is exactly what you'd expect; everything works.[3]
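You can see this for yourself. Here's a sketch, assuming the ctypetest.dll from the question built for 64-bit, showing that declaring the arguments as c_uint and as c_size_t both produce the same sum for values that fit in 32 bits:

import ctypes

lib = ctypes.CDLL('./ctypetest.dll')  # the DLL from the question
add = lib.add

# "Wrong" declaration: 32-bit arguments. Each value is zero-extended
# into the low half of RCX/RDX, so the 64-bit add still works.
add.restype = ctypes.c_uint
add.argtypes = [ctypes.c_uint, ctypes.c_uint]
print(add(1, 2))  # 3

# Correct declaration: 64-bit arguments, matching the real size_t signature.
add.restype = ctypes.c_size_t
add.argtypes = [ctypes.c_size_t, ctypes.c_size_t]
print(add(1, 2))  # also 3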
But imagine you were on a different platform, with a different ABI. In fact, your imagination doesn't have to go very far; if you run a 32-bit program on the same Windows machine, you get the Microsoft IA-32 ABI instead of Microsoft x64. That ABI has three different calling conventions, and the _cdecl in your declaration now selects one of the three, which works like this:

- All arguments are pushed onto the stack, right to left.
- The integer return value comes back in EAX.
- The caller is responsible for cleaning up the stack.
OK, now c_uint and size_t both happen to be 32 bits, but let's do the same thing with c_ushort.
Your Python code pushes two 16-bit values onto the stack. add tries to use both of your values, as in x | (y << 16), as its x parameter, and then whatever happens to be next to them on the stack as its y parameter. So, what you get back is garbage.
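If you want to try reproducing this, here's a sketch, assuming a 32-bit Python and a hypothetical 32-bit build of the same DLL (the name ctypetest32.dll is made up). Whether you actually see garbage depends on how your particular ctypes/libffi build packs sub-word arguments:

import ctypes

lib = ctypes.CDLL('./ctypetest32.dll')  # hypothetical 32-bit build
add = lib.add
add.restype = ctypes.c_uint
# Wrong: size_t is 32 bits on this platform, not 16.
add.argtypes = [ctypes.c_ushort, ctypes.c_ushort]
print(add(1, 2))  # may be 3, may be garbage, depending on argument packing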
And it can get even worse. What if you'd used _stdcall? In the Microsoft x64 ABI that does nothing, but in the Windows IA-32 ABI, it specifies the same parameter-passing order as _cdecl, but stack cleanup by the callee rather than the caller.
So, after generating your garbage for you, add goes to clean up the stack, and it's expecting a different size than what you gave it, and… well, actually, I think in this specific case you get away with it because the parameter area of the stack is padded to 16-byte alignment, so cleaning up 16 bytes instead of 8 doesn't matter. But that's just dumb luck.
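As an aside, ctypes itself encodes this cdecl/stdcall distinction in which loader you use; this is a fact about the ctypes API, not about the question's DLL:

import ctypes

# On 32-bit Windows, CDLL calls functions as __cdecl (caller cleans up),
# while WinDLL calls them as __stdcall (callee cleans up). On 64-bit
# Windows the two behave identically, because the x64 ABI has a single
# calling convention.
cdecl_lib = ctypes.CDLL('./ctypetest.dll')
stdcall_lib = ctypes.WinDLL('./ctypetest.dll')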
There are also some platforms that pass values in partial registers. For example, IIRC, the Win32s version of _fastcall did something like this:

- The first two integer arguments are passed in registers: in AX and DX if they're 16-bit, or in AL and DL if they're 8-bit.
- Remaining arguments go on the stack.
AL is just the low half of AX, and loading a byte into AL does not clear the high half. So, what happens if you call a _fastcall function that wanted to add two 16-bit numbers, but you thought it wanted to add two 8-bit numbers? You get the sum of x, y, z*256, and w*256, where z and w are just whatever happened to be left around in AH and DH by some previous instruction.
There's a reason all of my weird examples came from 32-bit and smaller ABIs. Most 64-bit ABIs were designed more recently, and less haphazardly, and specifically to make POSIX/C code and/or Win64/C code run nicely, so they tend to be pretty similar. For example, the System V AMD64 ABI (used by almost everything but Windows on x86_64), the AArch64 ABI (used by almost everything on ARM64), and the PowerPC64 ABI (used by everything on PowerPC64) all have basically the same calling convention as the Microsoft x64 ABI, except for a different set of integer-parameter registers, and slightly different floats-and-stuff rules. But that doesn't mean you can rely on it being safe to get the parameters wrong; it just means you have a harder time finding test systems to detect and debug your bugs…
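The portable fix, of course, is to declare the types the function actually uses. Here's a sketch, again assuming the DLL from the question:

import ctypes

lib = ctypes.CDLL('./ctypetest.dll')
add = lib.add

# Declare the real signature: size_t add(size_t, size_t).
# ctypes.c_size_t matches the platform's size_t on both 32- and 64-bit builds.
add.restype = ctypes.c_size_t
add.argtypes = [ctypes.c_size_t, ctypes.c_size_t]

print(hex(add(0xFFFFFFFF, 0xFFFFFFFF)))  # 0x1fffffffe on a 64-bit build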
1. You didn't say, but __declspec and _cdecl usually only appear in Windows code. And you said "a 64-bit system", and I doubt you're on Itanium or some other non-x86 64-bit platform.
2. There's some extra complexity for floats, SSE vectors, structs larger than 64 bits, varargs…
3. You might be a bit surprised that 0xffffffff + 0xffffffff is 0x00000001fffffffe instead of 0xfffffffe… but since you got the restype wrong as well, you're going to truncate that to 32 bits (and you're on a little-endian system, and one that returns values in registers; if both of those were not true, you'd get 1 as the answer…), and, since these are unsigned ints, truncating and rolling over look identical, so two errors cancel out and you see the 0xfffffffe you expected.
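You can check the arithmetic in plain Python:

>>> hex(0xffffffff + 0xffffffff)  # the 64-bit sum add really computes
'0x1fffffffe'
>>> hex((0xffffffff + 0xffffffff) & 0xffffffff)  # low 32 bits, what c_uint hands back
'0xfffffffe'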
Upvotes: 1