Reputation: 643

optimize python code for memory efficiency

I have the following Python code:

import numpy as np

sizes = 2000
array1 = np.empty((sizes, sizes, sizes, 3), dtype=np.float32)
for i in range(sizes):
    array1[i, :, :, 0] = 1.5*i
    array1[:, i, :, 1] = 2.5*i
    array1[:, :, i, 2] = 3.5*i

array2 = array1.reshape(sizes*sizes*sizes, 3)

#do something with array2

array3 = array2.reshape(sizes*sizes*sizes, 3)

I would like to make this code more memory efficient, but I don't know how. Is there a more memory-efficient way to use "numpy.reshape"?

Upvotes: 0

Answers (2)

ebarr

Reputation: 7842

So really your problem depends on what you are doing with the array. You are currently storing a large amount of redundant information. You could keep only 3*sizes of the 3*sizes**3 stored values (a fraction of 1/sizes**2, or about 0.000025% for sizes = 2000) and not lose anything.
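To put rough numbers on that redundancy, here is a back-of-the-envelope sketch using the sizes and dtype from the question. The variable names are mine, and it only counts the raw data, not any temporaries.

```python
import numpy as np

sizes = 2000
itemsize = np.dtype(np.float32).itemsize  # 4 bytes per float32

# Full array: sizes**3 points with 3 components each
full_bytes = sizes**3 * 3 * itemsize
# Three 1-D arrays of length sizes carry the same information
compact_bytes = 3 * sizes * itemsize

print(full_bytes / 1e9, "GB")     # 96.0 GB
print(compact_bytes / 1e3, "kB")  # 24.0 kB
```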

For instance, if we define the following three one dimensional arrays

a = np.linspace(0,(sizes-1)*1.5,sizes).astype(np.float32)
b = np.linspace(0,(sizes-1)*2.5,sizes).astype(np.float32)
c = np.linspace(0,(sizes-1)*3.5,sizes).astype(np.float32)

We can then reconstruct any entry along the last (fastest-varying) axis of your array1:

In [235]: array1[4][3][19] == np.array([a[4],b[3],c[19]])
Out[235]: array([ True,  True,  True], dtype=bool)

How useful this is depends on what you are doing with the array, since rebuilding array1 from a, b and c costs time. However, if you are nearing the limits of what your machine can handle, sacrificing performance for memory efficiency may be a necessary step. Moving a, b and c around also has a much lower overhead than moving array1 around.
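As a rough sketch of what "remaking" could look like in practice, the snippet below rebuilds single entries, or a block of rows of the flattened array, on demand. The helpers entry and rows are hypothetical names of mine, and sizes is kept small only so the full array can be built for comparison.

```python
import numpy as np

sizes = 5  # small here so the full array can be built for comparison

# 1-D factors: a[i] = 1.5*i, b[j] = 2.5*j, c[k] = 3.5*k
idx = np.arange(sizes, dtype=np.float32)
a, b, c = 1.5 * idx, 2.5 * idx, 3.5 * idx

def entry(i, j, k):
    """Rebuild array1[i, j, k] on demand (hypothetical helper)."""
    return np.array([a[i], b[j], c[k]], dtype=np.float32)

def rows(start, stop):
    """Rebuild rows start..stop-1 of the flattened (sizes**3, 3) array."""
    flat = np.arange(start, stop)
    # Map flat row numbers back to (i, j, k) in C order
    i, j, k = np.unravel_index(flat, (sizes, sizes, sizes))
    return np.stack([a[i], b[j], c[k]], axis=1)

# Build the full array as in the question, only to check the helpers
array1 = np.empty((sizes, sizes, sizes, 3), dtype=np.float32)
for n in range(sizes):
    array1[n, :, :, 0] = 1.5 * n
    array1[:, n, :, 1] = 2.5 * n
    array1[:, :, n, 2] = 3.5 * n
array2 = array1.reshape(sizes**3, 3)

assert np.array_equal(entry(4, 3, 2), array1[4, 3, 2])
assert np.array_equal(rows(10, 40), array2[10:40])
```

Processing the flattened array in blocks via something like rows keeps only one block in memory at a time, at the cost of recomputing the values for each block.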

Upvotes: 0

unutbu

Reputation: 880199

I think your code is already memory efficient.

When possible, np.reshape returns a view of the original array. That is the case here, so np.reshape is already as memory efficient as it can be.

Here is how you can tell np.reshape is returning a view:

import numpy as np
# Let's make array1 smaller; it won't change our conclusions
sizes = 5
array1 = np.arange(sizes*sizes*sizes*3).reshape((sizes, sizes, sizes, 3))

for i in range(sizes):
    array1[i, :, :, 0] = 1.5*i
    array1[:, i, :, 1] = 2.5*i
    array1[:, :, i, 2] = 3.5*i

array2 = array1.reshape(sizes*sizes*sizes, 3)

Note the value of array2 at a certain location:

assert array2[0,0] == 0

Change the corresponding value in array1:

array1[0,0,0,0] = 100

Note that the value of array2 changes.

assert array2[0,0] == 100

Since array2 changes due to a modification of array1, you can conclude that array2 is a view of array1. Views share the underlying data. Since there is no copy being made, the reshape is memory efficient.
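If you would rather not modify the data to find out, NumPy can also report this directly. A minimal sketch, assuming a zero-filled array of the same shape as a stand-in for array1:

```python
import numpy as np

sizes = 5
array1 = np.zeros((sizes, sizes, sizes, 3), dtype=np.float32)
array2 = array1.reshape(sizes*sizes*sizes, 3)

# True if the two arrays overlap in memory
print(np.shares_memory(array1, array2))         # True
# A view does not own its data buffer
print(array2.flags.owndata)                     # False
# An explicit copy, by contrast, shares nothing with the original
print(np.shares_memory(array1, array1.copy()))  # False
```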

array2 is already of shape (sizes*sizes*sizes, 3), so this reshape does nothing.

array3 = array2.reshape(sizes*sizes*sizes, 3)

Finally, the assert below shows that array3 was also affected by the modification made to array1, so array3 is also a view of array1.

assert array3[0,0] == 100
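For contrast, here is a sketch of a case where reshape cannot return a view and has to copy. The transposed array is my own example and is not something the code in the question does.

```python
import numpy as np

sizes = 5
array1 = np.zeros((sizes, sizes, sizes, 3), dtype=np.float32)

# Transposing reverses the axes without copying; the result is a
# non-contiguous view with shape (3, sizes, sizes, sizes)
transposed = array1.T
# Collapsing the last three axes cannot be expressed with strides
# over the original buffer, so reshape allocates a new array
flat = transposed.reshape(3, sizes*sizes*sizes)

# np.shares_memory reports whether two arrays overlap in memory
print(np.shares_memory(array1, transposed))  # True: still a view
print(np.shares_memory(array1, flat))        # False: reshape had to copy
```

So if the "do something" step transposes or otherwise reorders axes before reshaping, the reshape may silently double the memory in use.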

Upvotes: 1
