piotr.wierzgala

Why is it possible to modify an immutable bytes object using ctypes in Python 3?

A bytes object is immutable. It doesn't support item assignment:

>>> bar = b"bar"
>>> bar[0] = b"#"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'bytes' object does not support item assignment

A str object is also immutable:

>>> bar = "bar"
>>> bar[0] = "#"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment

It is possible to modify a bytes object with ctypes, but it is not possible to do the same with a str object. Could you explain why? Please have a look at the following examples.

C code

char* foo(char *bar) {
    bar[0] = '#';
    return bar;
}

C code compilation

gcc -shared -o clib.so -fPIC clib.c

bytes attempt

Python code

import ctypes

clib = ctypes.CDLL('./clib.so')

bar = b"bar"
print("Before:", bar, id(bar))

clib.foo(bar)
print("After: ", bar, id(bar))

Python code output

Before: b'bar' 140451244811328
After:  b'#ar' 140451244811328

str attempt

A str object is also immutable in Python 3, but unlike a bytes object, it cannot be modified through ctypes.

Python code

import ctypes

clib = ctypes.CDLL('./clib.so')

bar = "bar"
print("Before:", bar, id(bar))

clib.foo(bar)
print("After: ", bar, id(bar))

Python code output

Before: bar 140385853714080
After:  bar 140385853714080

Answers (1)

Mark Tolonen

str in Python 3 is an abstract sequence of Unicode code points. Internally it can be stored with 1, 2, or 4 bytes per character, depending on the highest code point used in the string. To pass the string to a C function, it must be converted to a specific representation. In this case ctypes passes a converted temporary buffer to C, not the original, so the original str is left untouched. ctypes can crash or corrupt Python if you prototype functions incorrectly or pass immutable objects to functions that mutate their contents; it is up to the user to be careful in these cases.
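
As a rough illustration of the variable-width storage, the sketch below estimates the bytes used per character by comparing the sizes of two strings of different lengths. It assumes CPython 3.3 or later; sys.getsizeof results are an implementation detail, and bytes_per_char is a hypothetical helper written only for this example.

```python
import sys

def bytes_per_char(ch, n=100):
    """Estimate storage per character for a string made only of ch."""
    # Both strings share the same fixed object header, so it cancels
    # out in the subtraction and only the per-character cost remains.
    return (sys.getsizeof(ch * 2 * n) - sys.getsizeof(ch * n)) // n

# One sample character each for the 1-, 2-, and 4-byte representations.
for ch in ("a", "\u0100", "\U00010000"):
    print(f"U+{ord(ch):04X}: {bytes_per_char(ch)} byte(s) per character")
```

Because the internal width varies like this, there is no single in-memory layout that could be handed to C directly, which is why a conversion step is needed.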

In the bytes case, ctypes passes a pointer to the internal buffer of the bytes object, but doesn't expect it to be modified. Consider:

a = b'123'
b = b'123'

Since bytes objects are immutable, Python is free to store a reference to the same object in both a and b. If you pass b to a ctypes-wrapped function that modifies it, a could be corrupted as well.
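
A small sketch of the aliasing concern. Whether two equal literals end up as one object is an implementation detail, so the first check is printed without assuming its result; the name c is introduced here only for illustration.

```python
a = b'123'
b = b'123'

# Implementation detail: the interpreter may or may not reuse a single
# object for both literals, so this can print True or False.
print("a is b:", a is b)

# Plain assignment never copies, so c is guaranteed to be the same
# object as a. Writing through a C pointer into c's buffer would
# therefore change what a contains as well.
c = a
print("c is a:", c is a)
```

Whenever two names do refer to one object, a C function that writes into its buffer changes the value seen through both names.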

Straight from the ctypes documentation:

You should be careful, however, not to pass [immutable objects] to functions expecting pointers to mutable memory. If you need mutable memory blocks, ctypes has a create_string_buffer() function which creates these in various ways....
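
A minimal sketch of the create_string_buffer() approach. To keep it runnable without the compiled clib.so, ctypes.memset stands in for the C function foo from the question; that substitution is an assumption made for this example only.

```python
import ctypes

original = b"bar"

# create_string_buffer copies the bytes into a new mutable char array
# (here 4 bytes: b"bar" plus a trailing NUL).
buf = ctypes.create_string_buffer(original)

# Stand-in for foo(): overwrite the first byte with '#'.
# ctypes.memset writes directly into the buffer's memory, as C code would.
ctypes.memset(buf, ord("#"), 1)

print(buf.value)   # b'#ar'      (contents up to the first NUL)
print(buf.raw)     # b'#ar\x00'  (the whole buffer)
print(original)    # b'bar'      (the immutable object is untouched)
```

With the library from the question, the equivalent would presumably be to set clib.foo.argtypes = [ctypes.c_char_p] and clib.foo.restype = ctypes.c_char_p, then call clib.foo(buf) in place of the memset line.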
