Reputation: 1407
How can I define a ctypes array buffer that can hold several numpy arrays of floats (say A, B, C) at one point in time, and then hold several numpy arrays of integers (say D, E) at another point in time? Can this be done with some combination of ctypes, numpy, or multiprocessing in Python?
Thank you. I am trying to use less memory.
Upvotes: 0
Views: 393
Reputation: 40713
First, is your program using too much memory? If the answer is "no" or "I'm not sure", then ignore this and carry on until you know you really do have a problem.
You can do all of what you want using "views" that are available within numpy. Views are just different ways of looking at the same data. For instance,
import numpy as np
ints32 = np.array([0, 0, 0, 0], dtype="<i4") # dtype string means little endian 4 byte ints
assert len(ints32) == 4
ints16 = ints32.view(dtype="<i2")
assert len(ints16) == 8 # a 16-bit int needs half as much space as a 32-bit int
ints32[0] = 0x11223344
assert ints16[0] == 0x3344
print(ints16) # prints [13124 4386 0 0 0 0 0 0]
# Thus, showing ints16 is backed by the same memory as ints32
You can also use an external buffer if you wish
buffer = bytearray(8)
floats32 = np.frombuffer(buffer, dtype="<f4")
floats32[0] = 1
print(buffer) # shows buffer has been modified
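Applied to the question, a minimal sketch might carve one buffer into three float arrays and later reuse the same bytes for two integer arrays. The names `pool` and `carve`, and all of the sizes, are hypothetical and chosen only for illustration. Writing through the integer views overwrites whatever the float views held, so A, B and C should be treated as dead once D and E are in use.

```python
import numpy as np

pool = bytearray(48)  # hypothetical size: room for twelve 4-byte items

def carve(buffer, dtype, lengths):
    """Return consecutive, non-overlapping arrays of `dtype` backed by `buffer`."""
    itemsize = np.dtype(dtype).itemsize
    arrays, offset = [], 0
    for n in lengths:
        # frombuffer raises ValueError if the request runs past the end of the buffer
        arrays.append(np.frombuffer(buffer, dtype=dtype, count=n, offset=offset))
        offset += n * itemsize
    return arrays

# First time point: three float32 arrays of 4 elements each
A, B, C = carve(pool, "<f4", [4, 4, 4])
A[:] = 1.0
B[:] = 2.0
C[:] = 3.0
floats_total = float(A.sum() + B.sum() + C.sum())

# Second time point: the same bytes reused as two int32 arrays of 6 elements each
D, E = carve(pool, "<i4", [6, 6])
D[:] = 0
E[:] = 7
ints_total = int(D.sum() + E.sum())
```

No new memory is allocated for D and E; they are just another interpretation of `pool`.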
You need to be careful, as you may end up with size mismatch errors when the buffer length is not a whole multiple of the new item size:
buf = np.zeros(3, dtype=np.int8) # 3 byte buffer
arr = buf.view(dtype=np.int16) # Raises ValueError! Needs a buffer whose size is a multiple of 2 bytes
two_byte_slice = buf[:2]
arr = two_byte_slice.view(dtype=np.int16) # Succeeds
arr[0] = 1
assert buf[0] == 1 # shows that two_byte_slice and arr are not copies of buf
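One way to guard against that error, sketched here with a hypothetical helper called `safe_view`, is to trim the byte buffer to the largest prefix that holds a whole number of items before reinterpreting it. The sketch assumes `buf` is a one-dimensional array of single-byte items.

```python
import numpy as np

def safe_view(buf, dtype):
    """View the largest prefix of the 1-D byte array `buf` that fits whole `dtype` items."""
    itemsize = np.dtype(dtype).itemsize
    usable = (buf.nbytes // itemsize) * itemsize  # round down to a whole number of items
    return buf[:usable].view(dtype)

buf = np.zeros(3, dtype=np.int8)  # 3 byte buffer, one byte too long for int16
arr = safe_view(buf, np.int16)    # views only the first 2 bytes
arr[0] = 1                        # writes through to buf
```

The trailing byte is simply left out of the view, so nothing is copied and `arr` still shares memory with `buf`.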
Sharing buffers with C libraries or other processes carries certain risks. These risks are usually mitigated by copying the buffer immediately and only using the copy. However, managed carefully, you can still be safe. For sharing a buffer with a C library, you must make sure:
Sharing the data with another process is more complicated, but can also be made safe.
See the following example for sharing a buffer with another process, and using a lock to synchronise access (strictly speaking the lock isn't necessary as the parent waits for the child to complete before continuing).
import numpy as np
import ctypes
from multiprocessing import Array, Process
def main():
    buf = Array(ctypes.c_int8, 10) # 10 byte buffer
    with buf: # acquire lock
        ctypes_arr = buf.get_obj()
        arr = np.frombuffer(ctypes_arr, dtype=np.int16) # int16 array, with size 5
        total = arr.sum()
        del arr, ctypes_arr # about to release the lock, so delete local references to the buffer
    print("total before:", total) # 0
    p = Process(target=subprocess_target, args=(buf,))
    p.start()
    p.join()
    with buf:
        # interpret first 8 bytes as two 4 byte ints
        view = memoryview(buf.get_obj())[:8]
        arr = np.frombuffer(view, dtype=np.int32)
        total = arr.sum()
        del arr, view
    print("total after:", total) # 262146
    raw_bytes = list(buf.get_obj())
    assert raw_bytes == [0, 0, 1, 0, 2, 0, 3, 0, 4, 0]

def subprocess_target(buf):
    """Sets elements in buf to [0, 1, ..., n-2, n-1]"""
    with buf:
        arr = np.frombuffer(buf.get_obj(), dtype=np.int16)
        arr[:] = range(len(arr))
        del arr

if __name__ == "__main__":
    main()
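The same shared buffer can hold floats at one moment and integers at the next, which is what the question asks for. The following is a minimal single-process sketch under assumed sizes. The name `shared` and the 32-byte length are hypothetical, and no child process is started, so the lock is only there to show the pattern.

```python
import ctypes
from multiprocessing import Array

import numpy as np

shared = Array(ctypes.c_byte, 32)  # hypothetical size: 32 byte buffer with a lock

with shared:  # hold the lock while a view on the buffer exists
    floats = np.frombuffer(shared.get_obj(), dtype=np.float64)  # 4 float64 items
    floats[:] = [0.5, 1.5, 2.5, 3.5]
    float_sum = float(floats.sum())
    del floats  # drop the view before the lock is released

with shared:
    ints = np.frombuffer(shared.get_obj(), dtype=np.int32)  # 8 int32 items
    ints[:] = np.arange(8)  # overwrites the bytes that held the floats
    int_sum = int(ints.sum())
    del ints
```

Deleting each view inside its `with` block keeps any reference to the buffer from outliving the lock, as in the example above.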
Upvotes: 1