change

Reputation: 3558

Efficiently manage memory and Python processing

I need to read a file that is 2.5 GB in size. I use the following code:

import numpy as np
limit=4269*352*288*3
mat1=np.zeros(limit, dtype='uint8')
mat1=np.fromfile("path/to/file", "uint8", limit)

I now need to reshape this array:

mat_new=np.zeros([4269, 288, 352, 3], dtype='uint8')
mat_new=np.transpose(np.reshape(mat1,[4269,3,288,352]), [0,2,3,1])

This takes about 35 seconds on my system (4 GB RAM, second-generation i7). Is there any way I could make it faster? This is just the start of my program, with more complex operations still to come. I do not need mat1 after this point.

Also, I am reading only half the file to start with, since Python gives me a 'maximum memory reached' error otherwise.

Upvotes: 2

Views: 381

Answers (1)

jfs

Reputation: 414485

it takes about 18 seconds to read 1.25 GB of data

That works out to about 70 MB/s, i.e., the speed is probably limited by your disk's I/O performance.

If you don't need the whole file in memory at once, you could use numpy.memmap:

mat = np.memmap("path/to/file", dtype='uint8', mode='r', shape=(4269, 3, 288, 352))  # shape matches the on-disk frame, channel, height, width layout
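For example, a minimal sketch of reading only a handful of frames through the memmap (it assumes the on-disk layout is frame, channel, height, width, as the question's reshape implies, and reuses the question's placeholder path):

import numpy as np

frames, channels, height, width = 4269, 3, 288, 352
# Nothing is read from disk yet; the array is backed by the file itself
mat = np.memmap("path/to/file", dtype='uint8', mode='r',
                shape=(frames, channels, height, width))
# A view in the desired (frame, height, width, channel) order; still no copy
mat_view = mat.transpose(0, 2, 3, 1)
# Only the part of the file backing these ten frames is actually read
first_ten = np.array(mat_view[:10])  # explicit copy of just this slice
print(first_ten.shape)               # (10, 288, 352, 3)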

To avoid memory errors, prefer operations that return views instead of copies, e.g.:

mat_new = mat1.reshape(4269, 3, 288, 352).transpose(0, 2, 3, 1)  # reshape and transpose both return views, no data copying
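A quick self-contained check (with small dummy dimensions rather than the question's file) that the chained reshape/transpose shares memory with the original buffer:

import numpy as np

mat1 = np.arange(2 * 3 * 4 * 5, dtype='uint8')             # stand-in for the file data
mat_new = mat1.reshape(2, 3, 4, 5).transpose(0, 2, 3, 1)

print(np.shares_memory(mat1, mat_new))  # True: same underlying buffer
print(mat_new.shape)                    # (2, 4, 5, 3)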

Also, as @HYRY said, remove the unnecessary np.zeros() calls; the arrays they allocate are discarded as soon as you rebind the names on the next line.
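Putting the pieces together, a sketch of the read path without the throwaway buffers could look like this (same placeholder path and dimensions as in the question):

import numpy as np

frames, channels, height, width = 4269, 3, 288, 352
limit = frames * channels * height * width

# fromfile allocates the array itself, so no preallocated zeros are needed
mat1 = np.fromfile("path/to/file", dtype='uint8', count=limit)
# reshape and transpose return views of mat1, so no extra ~1.3 GB copy is made
mat_new = mat1.reshape(frames, channels, height, width).transpose(0, 2, 3, 1)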

Upvotes: 1
