Lalu

Reputation: 155

np.save for large data in a for loop

As far as I know, np.save writes the whole array at once, outside the loop. Is it possible to put np.save inside the loop and append each row to a single file? The data I have is very large, and if I declare an array of size (200k x 20k), Python crashes. I know the shape of the array.

Sample code

for doPr in range(original_file.shape[0]):
    each_row = original_file[doPr, :]
    each_row = each_row + add_something
    np.save(path+"original_mod_file.npy", each_row)

I read original_file from the hard drive row by row. original_mod_file.npy is the file I want to write row by row, and it will be of size (200k x 20k).

Upvotes: 0

Views: 1497

Answers (1)

N. Kiefer

Reputation: 337

Copying from https://numpy.org/doc/stable/reference/generated/numpy.save.html

with open('test.npy', 'wb') as f:
    np.save(f, np.array([1, 2]))
    np.save(f, np.array([1, 3]))

Open the file you want to save to and change the mode from 'wb' to 'ab+'. This creates the file if it doesn't exist and lets you append to it. You can then append your data in a loop.
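To make that concrete, here is a minimal sketch of the append approach applied to the loop from the question. The small `original` array, the value of `add_something`, the temporary path, and the helper names `append_rows` and `read_rows` are stand-ins I made up for illustration. One thing to be aware of: the file ends up as a sequence of separate .npy records rather than one 2-D array, so a plain `np.load(path)` returns only the first row. You read the rest by calling `np.load` repeatedly on the same open file handle.

```python
import os
import tempfile

import numpy as np


def append_rows(path, rows):
    """Append each row to `path` as its own .npy record (hypothetical helper)."""
    with open(path, "ab") as f:  # 'ab' creates the file if needed, then appends
        for row in rows:
            np.save(f, row)


def read_rows(path):
    """Yield the saved arrays one at a time, in the order they were written."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        while f.tell() < size:  # stop once every record has been consumed
            yield np.load(f)


# Small stand-in for the real (200k x 20k) data.
original = np.arange(12, dtype=np.float64).reshape(4, 3)
add_something = 1.0  # assumed scalar; the question does not say what it is
path = os.path.join(tempfile.mkdtemp(), "original_mod_file.npy")

# Process and write one row at a time, so only one row is in memory.
append_rows(path, (original[i, :] + add_something for i in range(original.shape[0])))

# Reading back: stacking is fine here only because the demo data is tiny.
restored = np.vstack(list(read_rows(path)))
```

Since the shape is known up front, another option you could look at is `np.lib.format.open_memmap(path, mode="w+", dtype=..., shape=...)`, which creates a single regular .npy file on disk that you can fill row by row. That is a different technique from the append approach above, so treat it as a suggestion to try rather than part of this answer.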

Upvotes: 1
