Python Programmer

Reputation: 25

Python not passing correct size of string to C

When I pass a 16-character string from Python to C and scramble it, I keep getting random error codes back.

s = ctypes.c_wchar_p("H86ETJJJJHGFTYHr")


print(libc.hash_password(s))

At the start of the C code I added a statement that returns the size of the string to Python, but it keeps returning 8:

if (sizeof(my_string) != 17) return sizeof(my_string);

If I return a single element of the array instead, I get a number, which I assume is the ASCII value of the character, and the code does not error out.

This works for the last element as well, which is correctly recognised as a null.

The code works perfectly when called from C itself. So how can I get C to accept the string at its correct size, or get Python to accept the returned string?

EDIT: Forgot to mention, when I do

sizeof(*my_string)

it returns a 1

EDIT 2: Here is the function definition

unsigned char *hash_password(char *input_string)

Upvotes: 0

Views: 921

Answers (3)

Mark Tolonen

Reputation: 177725

In Python 3, "H86ETJJJJHGFTYHr" is a str object made up of Unicode code points. Your C function is declared as unsigned char *hash_password(char *input_string). A Python str is marshaled as wchar_t* when passed via ctypes, not char*. Pass a bytes object (for example b"H86ETJJJJHGFTYHr") to get a char*.
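To make the mismatch concrete, here is a minimal sketch that lays the same text out in memory both ways and then reads each buffer the way C code holding a char* would. It uses only ctypes buffers, so no compiled library is needed. The variable names are mine, and the single-character result for the wide buffer assumes a little-endian machine.

```python
import ctypes
import sys

text = "H86ETJJJJHGFTYHr"

# How a Python str is laid out for C: an array of wchar_t plus a terminator.
wide = ctypes.create_unicode_buffer(text)
# How a bytes object is laid out: an array of char plus a terminator.
narrow = ctypes.create_string_buffer(text.encode("ascii"))

# string_at() reads up to the first zero byte, as C code using char* would.
seen_from_wide = ctypes.string_at(ctypes.addressof(wide))
seen_from_narrow = ctypes.string_at(ctypes.addressof(narrow))

print(sys.byteorder, seen_from_wide, seen_from_narrow)
```

On a little-endian machine the wide buffer starts with 48 00, so a char* reader stops after the first character, while the bytes buffer is read in full.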

Assuming sizeof is ctypes.sizeof, it works like C and returns the size of the equivalent C object. For a c_wchar_p, that's a wchar_t*, and pointers typically have a size of 4 or 8 bytes (32- or 64-bit process). It is not the length of the string.
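A small sketch of that difference on the Python side, using the string from the question. The variable names are illustrative; the pointer size printed depends on your platform.

```python
import ctypes

s = ctypes.c_wchar_p("H86ETJJJJHGFTYHr")

pointer_size = ctypes.sizeof(s)   # size of a wchar_t* on this platform
string_length = len(s.value)      # number of characters in the string

print(pointer_size, string_length)
```

The first number matches ctypes.sizeof(ctypes.c_void_p) and stays the same no matter how long the string is.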

It's also always a good idea to declare the argument types and return type of a function when using ctypes, so it can check the type and number of arguments instead of guessing:

import ctypes

dll = ctypes.CDLL('./your.dll')
dll.hash_password.argtypes = ctypes.c_char_p,
dll.hash_password.restype = ctypes.c_char_p

A quick-and-dirty example (note printf returns length of string printed):

>>> from ctypes import *
>>> dll = CDLL('msvcrt')
>>> dll.printf('hello\n')  # ctypes assume wchar_t* for str, so passes UTF16-encoded data
h1                         # of 68 00 65 00 ... and prints only to first null, 1 char.
>>> dll.printf.argtypes=c_char_p, # tell ctypes the correct argument type
>>> dll.printf('hello\n')           # now it detects str is incorrect.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ctypes.ArgumentError: argument 1: <class 'TypeError'>: wrong type
>>> dll.printf(b'hello\n')          # pass bytes, and `char*` is marshaled to C
hello
6
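If you are not on Windows, the same type check can be reproduced without msvcrt. In this sketch a Python callback stands in for a C function that takes a char*. The names PROTOTYPE and count_bytes are hypothetical, and this is not your hash_password, only an illustration of how declared argument types reject str and accept bytes.

```python
import ctypes

# Hypothetical stand-in for a C function taking char*. It is a Python
# callback so that no DLL is needed; it just reports the byte count.
PROTOTYPE = ctypes.CFUNCTYPE(ctypes.c_size_t, ctypes.c_char_p)

@PROTOTYPE
def count_bytes(data):
    # A c_char_p argument arrives in the callback as a bytes object.
    return len(data)

try:
    count_bytes("H86ETJJJJHGFTYHr")        # str is rejected for c_char_p
    rejected = False
except ctypes.ArgumentError:
    rejected = True

length = count_bytes(b"H86ETJJJJHGFTYHr")  # bytes is accepted
print(rejected, length)
```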

Upvotes: 2

sizeof returns the size of an object in memory. This is not the same thing as the length of a string.

In your C code, my_string is a pointer to a string. sizeof(my_string) is the size of this pointer: 8 bytes on a 64-bit machine. sizeof(*my_string) is the size of what my_string points to. Since you're getting 1, it likely means that there's another problem in your C code, which is that you're mixing up single-byte characters (char, whose size is always 1 by definition) and wide characters (wchar_t, whose size is almost always 2 or 4).

Your string is a null-terminated wide character string. To obtain its length in C, call wcslen. Note that this means that your whole string processing code must use wchar_t and wcsxxx functions. If your string is a byte string, use char, strlen and other functions that work on byte strings.
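The element sizes and lengths involved can be checked from Python with ctypes. This is only a sketch with my own variable names: wstring_at and string_at read up to the terminator, so their lengths play the role of wcslen and strlen here.

```python
import ctypes

text = "H86ETJJJJHGFTYHr"
wide = ctypes.create_unicode_buffer(text)                   # wchar_t[17]
narrow = ctypes.create_string_buffer(text.encode("ascii"))  # char[17]

char_size = ctypes.sizeof(ctypes.c_char)     # always 1
wchar_size = ctypes.sizeof(ctypes.c_wchar)   # commonly 2 or 4

# Character counts up to the terminator, analogous to wcslen and strlen.
wide_length = len(ctypes.wstring_at(ctypes.addressof(wide)))
narrow_length = len(ctypes.string_at(ctypes.addressof(narrow)))

# Both hold 16 characters plus a terminator, but the storage size differs.
print(wide_length, narrow_length, ctypes.sizeof(wide), ctypes.sizeof(narrow))
```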

Upvotes: 1

Tony Suffolk 66

Reputation: 9704

In C, sizeof never returns the length of a string; it returns the size in memory of the variable.

For a string declared as

char *string;

string is a pointer to a character, and on your system it seems that pointers are 64 bits (i.e. 8 bytes).

When you do *string in C you get the content of the first element that string points to - i.e. a single character.

To get the length of a string in C, use strlen(my_string).

Upvotes: 1
