Adding an additional dimension to ndarray

Question

I have an ndarray defined in the following way:

dataset = np.ndarray(shape=(len(image_files), image_size, image_size),
                         dtype=np.float32)

This array represents a collection of images, each of size image_size * image_size. So I can write dataset[0] and get the 2D array corresponding to the image at index 0.

Now I would like to have one additional field for each image in this array. For instance, for the image at index 0 I would like to store the number 123, and for the image at index 321 I would like to store the number 50000.

What is the simplest way to add this additional data field to the existing ndarray? What is the appropriate way to access data in the new array after adding this additional dimension?

hpaulj · Accepted Answer

If you shuffle an index array instead of the dataset itself, you can keep track of the original 'identifiers':

idx = np.arange(len(image_files))
np.random.shuffle(idx)
shuffle_set = dataset[idx]

Illustration:

In [20]: x = np.arange(12).reshape(6,2)
    ...: idx = np.arange(6)
    ...: np.random.shuffle(idx) 
In [21]: x
Out[21]: 
array([[ 0,  1],
       [ 2,  3],
       [ 4,  5],
       [ 6,  7],
       [ 8,  9],
       [10, 11]])
In [22]: x[idx]             # shuffled
Out[22]: 
array([[ 4,  5],
       [ 0,  1],
       [ 2,  3],
       [ 6,  7],
       [10, 11],
       [ 8,  9]])
In [23]: idx1=np.argsort(idx)
In [24]: idx
Out[24]: array([2, 0, 1, 3, 5, 4])
In [25]: idx1
Out[25]: array([1, 2, 0, 3, 5, 4])
In [26]: Out[22][idx1]       # recover original order
Out[26]: 
array([[ 0,  1],
       [ 2,  3],
       [ 4,  5],
       [ 6,  7],
       [ 8,  9],
       [10, 11]])
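
The same index array can also carry the per-image number from the question. Here is a minimal sketch, assuming the extra values live in a separate 1-D array; the name `labels`, the sizes, and the sample values are made up for illustration and are not from the original post:

```python
import numpy as np

# Seeded generator, used only so the sketch is repeatable.
rng = np.random.default_rng(0)

n_images, image_size = 5, 4  # hypothetical sizes
dataset = np.zeros((n_images, image_size, image_size), dtype=np.float32)
# Fill image i with the value i so each image is easy to recognise.
dataset += np.arange(n_images, dtype=np.float32)[:, None, None]

# Hypothetical per-image values, one entry per image, kept in a parallel array.
labels = np.array([123, 7, 50000, 42, 9])

# Shuffle an index array, then apply it to both arrays.
idx = rng.permutation(n_images)
shuffled_set = dataset[idx]
shuffled_labels = labels[idx]

# Row k of the shuffled arrays came from original index idx[k].
for k in range(n_images):
    assert shuffled_set[k, 0, 0] == idx[k]
    assert shuffled_labels[k] == labels[idx[k]]

# argsort of the permutation gives the index that undoes it.
restore = np.argsort(idx)
restored_set = shuffled_set[restore]
restored_labels = shuffled_labels[restore]
```

Because both arrays are indexed with the same `idx`, `shuffled_labels[k]` always belongs to `shuffled_set[k]`, and `idx[k]` tells you which original image that row was.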
