Reputation: 136865
I have about 150,000 images which I want to load into a numpy array of shape [index][y][x][channel]. Currently, I do it like this:
images = numpy.zeros((len(data), 32, 32, 1))
for i, fname in enumerate(data):
    img = scipy.ndimage.imread(fname, flatten=False, mode='L')
    img = img.reshape((1, img.shape[0], img.shape[1], 1))
    for y in range(32):
        for x in range(32):
            images[i][y][x][0] = img[0][y][x][0]
This works, but I think there must be a better solution than iterating over the elements. I could get rid of the reshaping, but this would still leave the two nested for-loops.
What is the fastest way to build the same 4D images array when 150,000 images need to be loaded into it?
Upvotes: 3
Views: 142
Reputation: 231738
Essentially there are 2 approaches. The first preallocates the result array and fills one slot per image:
res = np.zeros((<correct shape>), dtype)
for i in range(...):
    img = <load>
    <reshape if needed>
    res[i,...] = img
If you've chosen the initial shape of res correctly, you should be able to copy each image array into its slot without a loop or much reshaping.
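As a minimal sketch of this preallocation approach, the snippet below fills a 4D array one image at a time. The load_image function is a hypothetical stand-in for whatever loader you actually use (it just fabricates a 32x32 grayscale array), and the uint8 dtype is an assumption that fits 8-bit grayscale data:

```python
import numpy as np

def load_image(i):
    # Hypothetical loader: returns a 32x32 uint8 array filled with a
    # value derived from i, standing in for reading a file from disk.
    return np.full((32, 32), i % 256, dtype=np.uint8)

n_images = 5  # small number for illustration only

# Preallocate the full result once, with the final shape and dtype.
res = np.zeros((n_images, 32, 32, 1), dtype=np.uint8)

for i in range(n_images):
    img = load_image(i)
    # One vectorized copy per image; no per-pixel Python loop.
    res[i, :, :, 0] = img

print(res.shape)
```

Choosing the dtype up front matters at this scale: np.zeros defaults to float64, which uses eight times the memory of uint8 for the same shape.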
The other approach appends each image to a list:
alist = []
for _ in range(...):
    img = <load>
    <reshape>
    alist.append(img)
res = np.array(alist)
This collects all component arrays into a list and uses np.array to join them into one array with a new dimension at the start. np.stack gives a little more control over which axis the arrays are joined along.
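A small sketch of the list-based approach, again with a hypothetical load_image stand-in rather than a real file reader. It shows np.array adding the new leading dimension, and np.stack placing the new dimension elsewhere:

```python
import numpy as np

def load_image(i):
    # Hypothetical loader: fabricates a 32x32 uint8 grayscale image.
    return np.full((32, 32), i % 256, dtype=np.uint8)

# Collect the individual 2D arrays in a plain Python list.
alist = [load_image(i) for i in range(5)]

# np.array joins them along a new first axis: shape (5, 32, 32).
res = np.array(alist)

# Add a trailing channel axis to get [index][y][x][channel].
res4d = res[..., np.newaxis]

# np.stack lets you pick where the new axis goes; axis=-1 puts the
# image index last instead of first: shape (32, 32, 5).
stacked_last = np.stack(alist, axis=-1)

print(res.shape, res4d.shape, stacked_last.shape)
```

Note that this variant briefly holds both the list of arrays and the final array in memory, which may matter with 150,000 images.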
Upvotes: 0
Reputation: 152860
Generally you don't need to copy single elements when dealing with numpy arrays. You can just specify the axes you want to copy to and/or from (provided the shapes are equal or broadcastable):
images[i,:,:,0] = img[0,:,:,0]
instead of your loops. In fact you don't need the reshape at all:
images[i,:,:,0] = scipy.ndimage.imread(fname, flatten=False, mode='L')
Each : specifies that you want that axis to be kept whole (not indexed down to a single position), and numpy supports array-to-array assignment, for example:
>>> a = np.zeros((3,3,3))
>>> a[0, :, :] = np.ones((3, 3))
>>> a
array([[[ 1., 1., 1.],
        [ 1., 1., 1.],
        [ 1., 1., 1.]],

       [[ 0., 0., 0.],
        [ 0., 0., 0.],
        [ 0., 0., 0.]],

       [[ 0., 0., 0.],
        [ 0., 0., 0.],
        [ 0., 0., 0.]]])
or
>>> a = np.zeros((3,3,3))
>>> a[:, 1, :] = np.ones((3, 3))
>>> a
array([[[ 0., 0., 0.],
        [ 1., 1., 1.],
        [ 0., 0., 0.]],

       [[ 0., 0., 0.],
        [ 1., 1., 1.],
        [ 0., 0., 0.]],

       [[ 0., 0., 0.],
        [ 1., 1., 1.],
        [ 0., 0., 0.]]])
Upvotes: 2