Muhammad Khan

Reputation: 45

Extracting features and labels from images before feeding to a densely connected classifier in Convolutional Network

I am trying to extract features and labels from images using a VGG16 convolutional base and then feed them to a densely connected classifier. The function to extract the features is given below.

from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rescale=1./255)
batch_size = 30
def extract_features(dataframe,directory, sample_count,x_col,y_col):
  features = np.zeros(shape=(sample_count, 4, 4, 512))
  labels = np.zeros(shape=(sample_count))
  generator = datagen.flow_from_dataframe(dataframe,directory,
  x_col,
  y_col,
  target_size=(150, 150),batch_size=batch_size,class_mode='raw')
  i = 0
  for inputs_batch, labels_batch in generator:
    features_batch = conv_base.predict(inputs_batch)
    features[i * batch_size : (i + 1) * batch_size] = features_batch
    labels[i * batch_size : (i + 1) * batch_size] = labels_batch
    i += 1
    if i * batch_size >= sample_count:
      break
  return features, labels

But when I try

train_features, train_labels = extract_features(dataframe=combined[:790],directory=directory1,x_col='file_name',y_col=target_columns,sample_count=790)
#validation_features, validation_labels = extract_features(combined[790:1002],directory=directory1, sample_count=212)

I get the following error.

Found 790 validated image filenames.

---------------------------------------------------------------------------

ValueError                                Traceback (most recent call last)

<ipython-input-44-9dd7d3b4ac22> in <module>()
----> 1 train_features, train_labels = extract_features(dataframe=combined[:790],directory=directory1,x_col='file_name',y_col=target_columns,sample_count=790)

<ipython-input-42-c13d7901073d> in extract_features(dataframe, directory, sample_count, x_col, y_col)
     14     features_batch = conv_base.predict(inputs_batch)
     15     features[i * batch_size : (i + 1) * batch_size] = features_batch
---> 16     labels[i * batch_size : (i + 1) * batch_size] = labels_batch
     17     i += 1
     18     if i * batch_size >= sample_count:

ValueError: could not broadcast input array from shape (30,63) into shape (30)

For reference, the data and label batch shapes of my data are given below.

for data_batch, labels_batch in train_generator:
  print('data batch shape:', data_batch.shape)
  print('labels batch shape:', labels_batch.shape)
  break

data batch shape: (32, 150, 150, 3)
labels batch shape: (32, 63)

I applied one-hot encoding. The dataframe has 64 columns in total. The first column is "file_name", which is the x column, and the remaining 63 columns are the targets.

In [72]:

combined.columns

Out[72]:

Index(['file_name', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10',
       '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22',
       '23', '24', '25', '26', '27', '28', '29', '30', '31', '32', '33', '34',
       '35', '36', '37', '38', '39', '40', '41', '42', '43', '44', '45', '46',
       '47', '48', '49', '50', '51', '52', '53', '54', '55', '58', '60', '61',
       '62', '63', '67', '69'],
      dtype='object')

Upvotes: 0

Views: 289

Answers (1)

Marco Cerliani

Reputation: 22031

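In your extract_features function, the labels array is created as 1-D with shape (sample_count,), but each batch of labels has shape (batch_size, 63), one column per target. Try initializing the labels array in this way: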

labels = np.zeros(shape=(sample_count, len(y_col)))
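
To see why the extra dimension matters, here is a minimal NumPy-only sketch that reproduces the broadcast error and the fix without Keras. The names n_targets, labels_1d, labels_2d and error_message are illustrative, and the all-ones labels_batch is only a stand-in for a real batch from the generator with class_mode='raw'.

```python
import numpy as np

sample_count = 790   # number of images, as in the question
batch_size = 30      # generator batch size, as in the question
n_targets = 63       # one column per one-hot target (len(y_col) in the question)

# Stand-in for one batch of labels; a real batch would come from the generator.
labels_batch = np.ones((batch_size, n_targets))

# 1-D buffer: each slot holds a single scalar, so a (30, 63) batch cannot fit.
labels_1d = np.zeros(shape=(sample_count,))
try:
    labels_1d[0:batch_size] = labels_batch
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# 2-D buffer: one row per sample, one column per target.
labels_2d = np.zeros(shape=(sample_count, n_targets))
labels_2d[0:batch_size] = labels_batch

print(error_message)      # the broadcast error raised by the 1-D buffer
print(labels_2d.shape)
```

With the 2-D buffer, each slice labels[i * batch_size : (i + 1) * batch_size] has the same number of columns as labels_batch, so the assignment in the loop goes through.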

Upvotes: 1
