Reputation: 1658
This differs from Write multiple numpy arrays to file in that I need to be able to stream content, rather than writing it all at once.
I need to write multiple compressed numpy arrays in binary to a file. I cannot store all the arrays in memory before writing, so it is more like streaming numpy arrays to a file.
This currently works fine as text:
file = open("some file", "w")
while doing stuff:
    file.writelines(somearray + "\n")  # somearray is a new instance every loop
However, this does not work if I try to write the arrays as binary.
The arrays are created at 30 Hz and grow too big to keep in memory. They also cannot each be stored in their own single-array file, because that would be wasteful and create a huge mess.
So I would like one file per session instead of 10k files per session.
Upvotes: 7
Views: 4814
Reputation: 2981
One option might be to use pickle to save the arrays to a file opened in append-binary mode:
import numpy as np
import pickle

arrays = [np.arange(n**2).reshape((n, n)) for n in range(1, 11)]

with open('test.file', 'ab') as f:
    for array in arrays:
        pickle.dump(array, f)

new_arrays = []
with open('test.file', 'rb') as f:
    while True:
        try:
            new_arrays.append(pickle.load(f))
        except EOFError:
            break

assert all((new_array == array).all() for new_array, array in zip(new_arrays, arrays))
This might not be the fastest, but it should be fast enough. It might seem like this would take up more space, but comparing these:
x = 300
y = 300
arrays = [np.random.randn(x, y) for x in range(30)]

with open('test2.file', 'ab') as f:
    for array in arrays:
        pickle.dump(array, f)

with open('test3.file', 'ab') as f:
    for array in arrays:
        f.write(array.tobytes())

with open('test4.file', 'ab') as f:
    for array in arrays:
        np.save(f, array)
You'll find the file sizes as 1,025 KB, 1,020 KB, and 1,022 KB respectively.
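For completeness, the np.save variant can be read back incrementally too: repeated np.load calls on the same open file return one array each until the data runs out. A minimal sketch (using an in-memory BytesIO in place of a real file opened in 'ab'/'rb' mode; the exact end-of-data exception varies by numpy version, so both are caught):

```python
import io

import numpy as np

arrays = [np.arange(n**2).reshape((n, n)) for n in range(1, 5)]

# Stream several arrays through one binary file-like object.
buf = io.BytesIO()
for array in arrays:
    np.save(buf, array)

# Read them back one at a time until the data runs out.
buf.seek(0)
loaded = []
while True:
    try:
        loaded.append(np.load(buf))
    except (EOFError, ValueError):  # end-of-data signal differs across numpy versions
        break

assert all((a == b).all() for a, b in zip(loaded, arrays))
```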
Upvotes: 3
Reputation: 114911
An NPZ file is just a zip archive, so you could save each array to a temporary NPY file, add that NPY file to the zip archive, and then delete the temporary file.
For example,
import os
import zipfile

import numpy as np

# File that will hold all the arrays.
filename = 'foo.npz'

with zipfile.ZipFile(filename, mode='w', compression=zipfile.ZIP_DEFLATED) as zf:
    for i in range(10):
        # `a` is the array to be written to the file in this iteration.
        a = np.random.randint(0, 10, size=20)

        # Name for the temporary file to which `a` is written. The root of this
        # filename is the name that will be assigned to the array in the npz file.
        # I've used 'arr_{}' (e.g. 'arr_0', 'arr_1', ...), similar to how `np.savez`
        # treats positional arguments.
        tmpfilename = "arr_{}.npy".format(i)

        # Save `a` to a npy file.
        np.save(tmpfilename, a)

        # Add the npy file to the zip archive.
        zf.write(tmpfilename)

        # Delete the npy file.
        os.remove(tmpfilename)
Here's an example where that script is run, and then the data is read back using np.load:
In [1]: !ls
add_array_to_zip.py
In [2]: run add_array_to_zip.py
In [3]: !ls
add_array_to_zip.py foo.npz
In [4]: foo = np.load('foo.npz')
In [5]: foo.files
Out[5]:
['arr_0',
'arr_1',
'arr_2',
'arr_3',
'arr_4',
'arr_5',
'arr_6',
'arr_7',
'arr_8',
'arr_9']
In [6]: foo['arr_0']
Out[6]: array([0, 9, 3, 7, 2, 2, 7, 2, 0, 5, 8, 1, 1, 0, 4, 2, 5, 1, 8, 2])
You'll have to test this on your system to see if it can keep up with your array generation process.
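As a side note, on Python 3.6+ the temporary file can be avoided entirely: zipfile.ZipFile.open accepts mode='w' and returns a writable file object, so np.save can write each array straight into the archive. A sketch of that variant (the 'foo2.npz' name is just for illustration):

```python
import zipfile

import numpy as np

arrays = [np.random.randint(0, 10, size=20) for _ in range(3)]

with zipfile.ZipFile('foo2.npz', mode='w',
                     compression=zipfile.ZIP_DEFLATED) as zf:
    for i, a in enumerate(arrays):
        # Write the NPY data directly into the archive entry --
        # no temporary file on disk (requires Python 3.6+).
        with zf.open('arr_{}.npy'.format(i), mode='w') as f:
            np.save(f, a)

# The result is still a valid npz file.
loaded = np.load('foo2.npz')
assert all((loaded['arr_{}'.format(i)] == a).all()
           for i, a in enumerate(arrays))
```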
Another alternative is to use something like HDF5, with either h5py or pytables.
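To sketch the HDF5 route with h5py (assuming h5py is installed; the 'frames.h5' name and the 300x300 frame shape are only illustrative): a chunked dataset with an unlimited first axis can be resized and appended to as each 30 Hz frame arrives, with compression applied per chunk:

```python
import numpy as np
import h5py  # assumption: h5py is available

frame_shape = (300, 300)

with h5py.File('frames.h5', 'w') as f:
    dset = f.create_dataset(
        'frames',
        shape=(0,) + frame_shape,        # start empty
        maxshape=(None,) + frame_shape,  # unlimited along the first axis
        chunks=(1,) + frame_shape,       # one frame per chunk
        compression='gzip',
    )
    for _ in range(5):                   # stands in for the 30 Hz loop
        frame = np.random.randn(*frame_shape)
        dset.resize(dset.shape[0] + 1, axis=0)
        dset[-1] = frame                 # append the new frame

with h5py.File('frames.h5', 'r') as f:
    final_shape = f['frames'].shape      # (5, 300, 300)
```

Only the current frame is ever in memory, and everything ends up in one file per session.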
Upvotes: 4