Reputation: 311
I'm trying to access two legacy de/compression functions from Python that are written in C and are currently available via a DLL (I have the C source).
The functions are passed a (partially) populated C struct and use this information to either compress or decompress the data in the buffer provided.
These are the function prototypes. I added __cdecl for Python compatibility.
// Both functions return 0 on success and nonzero value on failure
int __cdecl pkimplode(struct pkstream *pStr);
int __cdecl pkexplode(struct pkstream *pStr);
Here's the pkstream struct as defined in C:
struct pkstream {
    unsigned char *pInBuffer;    // Pointer to input buffer
    unsigned int nInSize;        // Size of input buffer
    unsigned char *pOutBuffer;   // Pointer to output buffer
    unsigned int nOutSize;       // Size of output buffer upon return
    unsigned char nLitSize;      // Specifies fixed or var size literal bytes
    unsigned char nDictSizeByte; // Dictionary size; either 1024, 2048, or 4096
    // The rest of the members of this struct are used internally,
    // so setting these values outside pkimplode or pkexplode has no effect
    unsigned char *pInPos;       // Current position in input buffer
    unsigned char *pOutPos;      // Current position in output buffer
    unsigned char nBits;         // Number of bits in bit buffer
    unsigned long nBitBuffer;    // Stores bits until enough to output a byte
    unsigned char *pDictPos;     // Position in dictionary
    unsigned int nDictSize;      // Maximum size of dictionary
    unsigned int nCurDictSize;   // Current size of dictionary
    unsigned char Dict[0x1000];  // Sliding dictionary used for comp/decomp
};
This is my attempt at mirroring this struct in Python.
# Define the pkstream struct
class PKSTREAM(Structure):
    _fields_ = [('pInBuffer', c_ubyte),
                ('nInSize', c_uint),
                ('pOutBuffer', c_ubyte),
                ('nOutSize', c_uint),
                ('nLitSize', c_ubyte),
                ('nDictSizeByte', c_ubyte),
                ('pInPos', c_ubyte),
                ('pOutPos', c_ubyte),
                ('nBits', c_ubyte),
                ('nBitBuffer', c_ulong),
                ('pDictPos', c_ubyte),
                ('nDictSize', c_uint),
                ('nCurDictSize', c_uint),
                ('Dict', c_ubyte)]
I would really appreciate some help with the following questions (which I'm choosing to ask up front rather than just 'winging' it, hopefully for obvious reasons):
1. I'm not sure whether to use c_ubyte, c_char or c_char_p for the members of type unsigned char. c_ubyte most closely maps to unsigned char (according to the ctypes docs, at least), but is actually an ?int/long? in Python.
2. Sometimes the member is a pointer to an unsigned char ... would this map to c_char_p? The ctypes docs say ALL byte & unicode strings are passed as pointers anyway, so what provisions do I need to make for this?
3. I need to provide pOutBuffer to the function, which should be a pointer to allocated memory into which the function can copy the de/compressed data. I believe I should use create_string_buffer() to create an appropriately sized buffer for this?
4. I also need to know how to define the member Dict[0x1000], which looks (to me) like it creates a 4096-byte buffer for internal use within the functions. I know my definition is clearly wrong, but I don't know how it should be defined.
5. Should the C functions be decorated as __stdcall or __cdecl? (I've been using the latter on some test DLLs as I've worked my way up to this point).
Any feedback would be VERY much appreciated!
Thanks in advance,
James
Upvotes: 3
Views: 2835
Reputation: 110311
If the data in the structure is a pointer, you have to declare it as a pointer on the Python side as well.
One way to do that is to use the POINTER utility in ctypes. It is a somewhat higher-level object than ctypes.c_char_p (and not fully compatible with it), but your code will become more readable. Also, to simulate C arrays, a base ctypes type can be multiplied by an integer; the result is an array type of that length with that element type, so the Dict field can be defined as c_ubyte * 4096.
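As a minimal sketch of the two constructs just described (the names DictArray, UBytePtr, dict_buf and ptr are illustrative and not part of the original code): multiplying a ctypes type by an integer gives an array type, and POINTER() gives a pointer type for a base type.

```python
import ctypes

# Multiplying a ctypes type by an integer creates a fixed-size array type.
DictArray = ctypes.c_ubyte * 0x1000         # like: unsigned char Dict[0x1000]
dict_buf = DictArray()                      # zero-initialised instance

# POINTER() creates a pointer type for the given base type.
UBytePtr = ctypes.POINTER(ctypes.c_ubyte)   # like: unsigned char *
ptr = ctypes.cast(dict_buf, UBytePtr)       # points at the array's first element

dict_buf[0] = 0x41
print(ctypes.sizeof(DictArray))             # size of the array type in bytes
print(ptr[0])                               # same byte, read through the pointer
```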
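Note that each C type has its own ctypes counterpart: unsigned char is c_ubyte, unsigned int is c_uint, and unsigned long is c_ulong, while plain char, int, and long are c_char, c_int, and c_long. The types in each pair have the same size, so the struct layout is the same either way, but the choice determines how Python reads the values back (c_int and c_long as signed integers, c_char as one-byte bytes objects).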
Your structure definition does not state that the pointed-to buffers are const. If you pass a Python string (immutable) and your library tries to alter it, you will get errors. Instead you should pass mutable memory returned from create_string_buffer, initialised with your string.
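A small sketch of that advice, assuming an arbitrary sample input and an arbitrary output size of 64 bytes (neither comes from the original post): create_string_buffer copies the bytes into mutable memory, and the result can be cast to a pointer type for use in a struct field.

```python
import ctypes

data = b"hello world"   # illustrative input; Python bytes are immutable

# create_string_buffer copies the bytes into mutable memory owned by ctypes.
# Passing an explicit size avoids the extra trailing NUL byte.
in_buf = ctypes.create_string_buffer(data, len(data))

# An output buffer can be allocated by size alone; it starts zero-filled.
out_buf = ctypes.create_string_buffer(64)   # 64 is an arbitrary example size

# Unlike bytes, the buffer can be modified in place.
in_buf[0] = b"J"
print(in_buf.raw)

# Cast to a pointer type so it can be stored in a pointer field of a Structure.
p_in = ctypes.cast(in_buf, ctypes.POINTER(ctypes.c_char))
```

After a call into the library, out_buf.raw would hold whatever the C code wrote into the output buffer.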
from ctypes import POINTER, Structure, c_char, c_ubyte, c_uint, c_ulong

# Define the pkstream struct.
# Scalar fields use the unsigned ctypes types to match the C declaration
# (unsigned char -> c_ubyte, unsigned int -> c_uint, unsigned long -> c_ulong).
class PKSTREAM(Structure):
    _fields_ = [('pInBuffer', POINTER(c_char)),
                ('nInSize', c_uint),
                ('pOutBuffer', POINTER(c_char)),
                ('nOutSize', c_uint),
                ('nLitSize', c_ubyte),
                ('nDictSizeByte', c_ubyte),
                ('pInPos', POINTER(c_char)),
                ('pOutPos', POINTER(c_char)),
                ('nBits', c_ubyte),
                ('nBitBuffer', c_ulong),
                ('pDictPos', POINTER(c_char)),
                ('nDictSize', c_uint),
                ('nCurDictSize', c_uint),
                ('Dict', c_ubyte * 0x1000)]
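Putting the pieces together, here is a hedged sketch of how the caller-facing fields might be populated before a call. The helper make_stream and its parameters are hypothetical, the struct is redefined with unsigned types so the example is self-contained, and the correct values for nLitSize and nDictSizeByte depend on the C source, which is not shown here.

```python
import ctypes
from ctypes import POINTER, Structure, c_char, c_ubyte, c_uint, c_ulong

class PKSTREAM(Structure):
    # Mirrors the C struct; unsigned C types map to unsigned ctypes types.
    _fields_ = [('pInBuffer', POINTER(c_char)), ('nInSize', c_uint),
                ('pOutBuffer', POINTER(c_char)), ('nOutSize', c_uint),
                ('nLitSize', c_ubyte), ('nDictSizeByte', c_ubyte),
                ('pInPos', POINTER(c_char)), ('pOutPos', POINTER(c_char)),
                ('nBits', c_ubyte), ('nBitBuffer', c_ulong),
                ('pDictPos', POINTER(c_char)), ('nDictSize', c_uint),
                ('nCurDictSize', c_uint), ('Dict', c_ubyte * 0x1000)]

def make_stream(data, out_size, lit_size, dict_size_byte):
    """Hypothetical helper: build a PKSTREAM plus the buffers it points to."""
    in_buf = ctypes.create_string_buffer(data, len(data))  # mutable copy
    out_buf = ctypes.create_string_buffer(out_size)        # zero-filled
    stream = PKSTREAM()
    stream.pInBuffer = ctypes.cast(in_buf, POINTER(c_char))
    stream.nInSize = len(data)
    stream.pOutBuffer = ctypes.cast(out_buf, POINTER(c_char))
    # nOutSize is left at 0: the C comment says it is filled in on return.
    stream.nLitSize = lit_size
    stream.nDictSizeByte = dict_size_byte
    # Return the buffers too, so the caller keeps them alive and can read
    # the output from out_buf after the call.
    return stream, in_buf, out_buf
```

With a loaded library object lib, a call would then look like lib.pkimplode(ctypes.byref(stream)), followed by reading out_buf.raw[:stream.nOutSize].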
As for (5), I don't know how you should decorate your C functions - use whatever works.
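On the ctypes side, the calling convention is chosen by the loader class: ctypes.CDLL calls functions as cdecl, while ctypes.WinDLL (Windows only) calls them as stdcall. The sketch below assumes hypothetical helper names load_library and declare_prototypes; only the exported names pkimplode and pkexplode come from the question.

```python
import ctypes
import sys

def load_library(path, stdcall=False):
    """Load a library with the loader that matches its calling convention."""
    if stdcall:
        if sys.platform != "win32":
            raise OSError("stdcall (WinDLL) is only available on Windows")
        return ctypes.WinDLL(path)   # functions are called as __stdcall
    return ctypes.CDLL(path)         # functions are called as __cdecl

def declare_prototypes(lib, stream_type):
    """Declare argument and return types for the two exported functions."""
    for name in ("pkimplode", "pkexplode"):
        func = getattr(lib, name)
        func.argtypes = [ctypes.POINTER(stream_type)]  # struct pkstream *
        func.restype = ctypes.c_int                    # 0 on success
    return lib
```

Since the prototypes in the question are declared __cdecl, load_library(path) with the default stdcall=False would be the matching choice; the DLL path itself is whatever the build produces.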
Upvotes: 2