otterb

Reputation: 2710

Stack images as a numpy array faster (than preallocation)?

I often need to stack 2D numpy arrays (TIFF images). To do that, I first append them to a list and then call np.dstack. This seems to be the fastest way to stack images into a 3D array. But is there a faster or more memory-efficient way?

from time import time
import numpy as np

# Create 100 images of the same dimension 256x512 (8-bit).
# In reality, each image comes from a different file
img = np.random.randint(0,255,(256, 512, 100))

# append each image to a list, then dstack once at the end
t0 = time()
temp = []
for n in range(100):
    temp.append(img[:,:,n])
stacked = np.dstack(temp)
#stacked = np.array(temp)  # much slower 3.5 s for 100

print(time()-t0)  # 0.58 s for 100 frames
print(stacked.shape)

# dstack in each loop is slower
t0 = time()
temp = img[:,:,0]
for n in range(1, 100):
    temp = np.dstack((temp, img[:,:,n]))
print(time()-t0)  # 3.13 s for 100 frames
print(temp.shape)

# counter-intuitive but preallocation is slightly slower
stacked = np.empty((256, 512, 100))
t0 = time()
for n in range(100):
    stacked[:,:,n] = img[:,:,n]
print(time()-t0)  # 0.651 s for 100 frames
print(stacked.shape)

# (Edit) As in the accepted answer, rearranging the axes so that the
# image index is the first axis improved the speed significantly.
img = np.random.randint(0,255,(100, 256, 512))

stacked = np.empty((100, 256, 512))
t0 = time()
for n in range(100):
    stacked[n,:,:] = img[n,:,:]
print(time()-t0)  # 0.08 s for 100 frames
print(stacked.shape)

Upvotes: 4

Views: 7459

Answers (1)

Magellan88

Reputation: 2573

After some joint effort with otterb, we concluded that preallocating the array is the way to go. Apparently the performance-killing bottleneck was the array layout, with the image number (n) being the fastest-changing index. NumPy arrays default to "C" ordering, where the first index changes slowest and the last index changes fastest. So if we make n the first index of the array, each image occupies one contiguous block of memory and we get the best performance:

from time import time
import numpy as np

# Create 100 images of the same dimension 256x512 (8-bit).
# In reality, each image comes from a different file
img = np.random.randint(0,255,(100, 256, 512))

# preallocate with the image number n as the first axis
stacked = np.empty((100, 256, 512))
t0 = time()
for n in range(100):
    stacked[n] = img[n]  # each frame is written to one contiguous block
print(time()-t0)
print(stacked.shape)
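
To see why the axis order matters, it may help to look at the strides and contiguity flags of a single frame in each layout. This is only a small sketch: the array names and the tiny shapes are made up for illustration and are not from the benchmark above.

```python
import numpy as np

# Hypothetical tiny stacks: 4 frames of 3x5 pixels, default "C" ordering.
frames_first = np.empty((4, 3, 5), dtype=np.uint8)  # frame index is the first axis
frames_last = np.empty((3, 5, 4), dtype=np.uint8)   # frame index is the last axis

# A frame taken along the first axis is one contiguous block of memory.
first_is_contiguous = frames_first[0].flags['C_CONTIGUOUS']

# A frame taken along the last axis is scattered: neighbouring pixels
# of the same frame sit 4 bytes apart, interleaved with the other frames.
last_is_contiguous = frames_last[:, :, 0].flags['C_CONTIGUOUS']

print(frames_first[0].strides, first_is_contiguous)      # (5, 1) True
print(frames_last[:, :, 0].strides, last_is_contiguous)  # (20, 4) False
```

Writing into a contiguous frame lets NumPy copy one block at a time, which is presumably where most of the speedup comes from.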

Upvotes: 6
