osm

Reputation: 622

Filtering a numpy array using another array of labels

Given two NumPy arrays:

images.shape: (60000, 784) # An array containing 60000 images
labels.shape: (60000, 10)  # An array of labels for each image

Each row of labels contains a 1 at a particular index to indicate the class of the related example in images. (So [0 0 1 0 0 0 0 0 0 0] would indicate that the example belongs to Class 2, assuming our class indexing starts from 0.)

I am trying to efficiently separate images so that I can manipulate all images belonging to a particular class at once. The most obvious solution would be to use a for loop (as follows). However, I'm not sure how to filter images such that only those with the appropriate labels are returned.

for i in range(0, labels.shape[1]):
  class_images = # (?) Array containing all images that belong to class i

As an aside, I'm also wondering if there are even more efficient approaches that would eliminate the use of the for loop.

Upvotes: 0

Views: 1405

Answers (2)

Paul Panzer

Reputation: 53029

One way would be to convert your label array to bool and use it for indexing:

classes = []
blabels = labels.astype(bool)
for i in range(10):
    classes.append(images[blabels[:, i], :])

Or as a one-liner using list comprehension:

classes = [images[l.astype(bool), :] for l in labels.T]
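On the aside about dropping the loop: one possible sketch, not necessarily faster in practice, is to recover the class index with argmax, sort the rows by class, and split at the class boundaries. The toy arrays below are made-up stand-ins for the real images and labels, and the sketch assumes every row of labels is a valid one-hot vector.

```python
import numpy as np

# Toy stand-in data (assumption): 6 "images" of 4 pixels each, 3 classes.
images = np.arange(24).reshape(6, 4)
labels = np.eye(3, dtype=int)[[2, 0, 1, 0, 2, 1]]  # one-hot rows, shape (6, 3)

# Class index of each image, read off its one-hot row.
class_ids = labels.argmax(axis=1)

# A stable sort groups rows by class and keeps the original order within a class.
order = np.argsort(class_ids, kind="stable")

# Number of images per class; minlength keeps empty classes in the result.
counts = np.bincount(class_ids, minlength=labels.shape[1])

# Split the reordered images at the cumulative class boundaries.
classes = np.split(images[order], np.cumsum(counts)[:-1])
```

Here classes[i] should hold the same rows as images[labels[:, i].astype(bool)], so it can be used in place of the list built by the loop above.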

Upvotes: 1

Pouyan

Reputation: 85

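# One list of image indices per class.
_classes = [[] for _ in range(labels.shape[1])]
for image_index, element in enumerate(labels):
    # NumPy rows have no .index() method; argmax gives the position of the 1.
    _classes[int(element.argmax())].append(image_index)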

For example, _classes[0] will contain the indices of the images that are classified as class 0.
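To get from index lists to the images themselves, the indices can be passed straight to the array. A minimal sketch on made-up toy data follows; it swaps the Python loop for np.flatnonzero, and the name class_indices is hypothetical.

```python
import numpy as np

# Toy stand-in data (assumption): 5 "images" of 4 pixels each, 3 classes.
images = np.arange(20).reshape(5, 4)
labels = np.eye(3, dtype=int)[[1, 0, 1, 2, 0]]  # one-hot rows, shape (5, 3)

# For each class, the row indices whose label column is nonzero.
class_indices = [np.flatnonzero(labels[:, i]) for i in range(labels.shape[1])]

# Integer-array indexing selects the images of class 0.
class0_images = images[class_indices[0]]
```

With this toy data, class_indices[0] holds rows 1 and 4, so class0_images has shape (2, 4).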

Upvotes: 0
