Reputation: 2170
I have a very simple question related to transfer learning and the VGG16 NN.
Here is my code:
import numpy as np
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dropout, Flatten, Dense
from keras import applications
from keras import optimizers
from keras.applications.vgg16 import VGG16
from keras.applications.vgg16 import preprocess_input
img_width, img_height = 150, 150
top_model_weights_path = 'full_hrct_model_weights.h5'
train_dir = 'hrct_data_small/train'
validation_dir = 'hrct_data_small/validation'
nb_train_samples = 3000
nb_validation_samples = 600
epochs = 50
batch_size = 20
def save_bottleneck_features():
    datagen = ImageDataGenerator(rescale=1. / 255)

    # build the vgg16 model
    model = applications.VGG16(include_top=False, weights='imagenet')

    generator = datagen.flow_from_directory(
        train_dir,
        target_size=(img_width, img_height),
        shuffle=False,
        class_mode=None,
        batch_size=batch_size
    )
    bottleneck_features_train = model.predict_generator(generator=generator, steps=nb_train_samples // batch_size)
    np.save(file="bottleneck_features_train_ternary_class.npy", arr=bottleneck_features_train)

    generator = datagen.flow_from_directory(
        validation_dir,
        target_size=(img_width, img_height),
        shuffle=False,
        class_mode=None,
        batch_size=batch_size,
    )
    bottleneck_features_validation = model.predict_generator(generator, nb_validation_samples // batch_size)
    np.save(file="bottleneck_features_validate_ternary_class.npy", arr=bottleneck_features_validation)

save_bottleneck_features()
"Found 3000 images belonging to 3 classes."
"Found 600 images belonging to 3 classes."
def train_top_model():
    train_data = np.load(file="bottleneck_features_train_ternary_class.npy")
    train_labels = np.array([0] * (nb_train_samples // 2) + [1] * (nb_train_samples // 2))
    validation_data = np.load(file="bottleneck_features_validate_ternary_class.npy")
    validation_labels = np.array([0] * (nb_validation_samples // 2) + [1] * (nb_validation_samples // 2))

    model = Sequential()
    model.add(Flatten(input_shape=train_data.shape[1:]))  # no need to give the batch size in the input shape
    model.add(Dense(256, activation='relu'))
    model.add(Dropout(0.5))  # rate assumed; the summary printed below shows a Dropout layer here
    model.add(Dense(3, activation='sigmoid'))

    model.summary()

    model.compile(optimizer='Adam', loss='binary_crossentropy', metrics=['accuracy'])
    model.fit(train_data, train_labels,
              epochs=epochs,
              batch_size=batch_size,
              validation_data=(validation_data, validation_labels))
    model.save_weights(top_model_weights_path)

train_top_model()
The error I get is this:
ValueError Traceback (most recent call last)
<ipython-input-52-33db5c28e162> in <module>()
2 epochs=epochs,
3 batch_size=batch_size,
----> 4 validation_data=(validation_data, validation_labels))
/Users/simonalice/anaconda/lib/python3.5/site-packages/keras/models.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, **kwargs)
854 class_weight=class_weight,
855 sample_weight=sample_weight,
--> 856 initial_epoch=initial_epoch)
857
858 def evaluate(self, x, y, batch_size=32, verbose=1,
/Users/simonalice/anaconda/lib/python3.5/site-packages/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, **kwargs)
1427 class_weight=class_weight,
1428 check_batch_axis=False,
-> 1429 batch_size=batch_size)
1430 # Prepare validation data.
1431 if validation_data:
/Users/simonalice/anaconda/lib/python3.5/site-packages/keras/engine/training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, check_batch_axis, batch_size)
1307 output_shapes,
1308 check_batch_axis=False,
-> 1309 exception_prefix='target')
1310 sample_weights = _standardize_sample_weights(sample_weight,
1311 self._feed_output_names)
/Users/simonalice/anaconda/lib/python3.5/site-packages/keras/engine/training.py in _standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
137 ' to have shape ' + str(shapes[i]) +
138 ' but got array with shape ' +
--> 139 str(array.shape))
140 return arrays
141
ValueError: Error when checking target: expected dense_32 to have shape (None, 3) but got array with shape (3000, 1)
Here is the model summary:
Layer (type) Output Shape Param #
=================================================================
flatten_16 (Flatten) (None, 8192) 0
_________________________________________________________________
dense_31 (Dense) (None, 256) 2097408
_________________________________________________________________
dropout_16 (Dropout) (None, 256) 0
_________________________________________________________________
dense_32 (Dense) (None, 3) 771
=================================================================
Total params: 2,098,179
Trainable params: 2,098,179
Non-trainable params: 0
I suspect my difficulty here highlights a fundamental misunderstanding on my part, so I need a very straightforward explanation. I have 3 classes in this training: 'hrct_data_small/train' contains 3 folders and 'hrct_data_small/validation' contains 3 folders.
First: Am I correct in thinking that the last layer of the top model:
model.add(Dense(3, activation='sigmoid'))
should be "3" as I have 3 classes.
Second:
I grabbed the data shapes to investigate
train_data = np.load(file="bottleneck_features_train_ternary_class.npy")
train_labels = np.array([0] * (nb_train_samples // 2) + [1] * (nb_train_samples // 2))
validation_data =np.load(file="bottleneck_features_validate_ternary_class.npy")
validation_labels = np.array([0] * (nb_validation_samples // 2) + [1] * (nb_validation_samples // 2))
Then
print("Train data shape", train_data.shape)
print("Train_labels shape", train_labels.shape)
print("Validation_data shape", validation_labels.shape)
print("Validation_labels", validation_labels.shape)
And the result is
Train data shape (3000, 4, 4, 512)
Train_labels shape (3000,)
Validation_data shape (600, 4, 4, 512)
Validation_labels shape (600,)
So, should the "Train data shape" variable be in the shape (3000, 3).
My apologies for these basic questions - if I can simply get some clear thinking on this I's be grateful.
EDIT: So thanks to the advice of Naseem below, I addressed all of his points except one: the bottleneck features come back in directory order, so the first class comes first (1000 samples), then the second (1000), then the third (1000). Therefore the train_labels need to be in that order:
train_data = np.load(file="bottleneck_features_train_ternary_class.npy")
train_labels = np.array([0] * 1000 + [1] * 1000 + [2] * 1000)
validation_data = np.load(file="bottleneck_features_validate_ternary_class.npy")
validation_labels = np.array([0] * 400 + [1] * 400 + [2] * 400)
I then fixed the labels like so (note the np_utils import):
from keras.utils import np_utils

train_labels = np_utils.to_categorical(train_labels, 3)
validation_labels = np_utils.to_categorical(validation_labels, 3)
Which got the labels into the right shape and one-hot encoded them. I examined the first few and they were correct. The model then worked.
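For anyone checking the same thing, here is a quick way to see what to_categorical produces (a minimal sketch; the sample labels here are made up):
import numpy as np
from keras.utils import np_utils

sample_labels = np.array([0, 1, 2])  # one made-up label per class
print(np_utils.to_categorical(sample_labels, 3))
# each row is a one-hot vector of length 3:
# label 0 -> [1, 0, 0], label 1 -> [0, 1, 0], label 2 -> [0, 0, 1]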
As an additional comment - all of the answers were in the Keras documentation. If I had spent a bit more time reading and less time cutting and pasting code, I would have got it right. Lesson learned.
Upvotes: 2
Views: 1972
Reputation: 11543
I'm not sure this will clear everything up in your mind, but here are some errors I see in your code:
The way you create your labels is really weird to me. Why do you put half of the data as 0 and the other half as 1 in:
train_labels = np.array([0] * (nb_train_samples // 2) + [1] * (nb_train_samples // 2))
This doesn't seem right, and you need to explain a bit more what you are trying to predict. Normally your labels should be produced by the generator: set class_mode='categorical' instead of class_mode=None. This makes the generator output both inputs and targets, where the targets are a series of one-hot encoded vectors of length 3.
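As a rough sketch of that change (reusing the variable names from your question; the printed shape assumes your batch size of 20 and 3 classes):
datagen = ImageDataGenerator(rescale=1. / 255)
generator = datagen.flow_from_directory(
    train_dir,
    target_size=(img_width, img_height),
    shuffle=False,
    class_mode='categorical',  # yields (inputs, targets) instead of inputs only
    batch_size=batch_size
)
x_batch, y_batch = next(generator)
print(y_batch.shape)  # (20, 3): one one-hot vector of length 3 per image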
The loss you are using is loss='binary_crossentropy'. This is used when you are classifying images that can belong to multiple categories at once, or when you have only 2 possible classes. That is not your case (if I understand correctly). You should use loss='categorical_crossentropy', which is for when each image has one, and no more than one, class as its target.
This is linked to the previous point: the activation of the last layer, model.add(Dense(3, activation='sigmoid')). The sigmoid allows your output to be [1 1 0], [1 1 1], or [0 0 0], which are not valid in your case: you want to predict only one class, not have your image classified as belonging to all 3 classes. What we use for this classification case is a softmax. The softmax normalizes the outputs so that they sum to 1. You can then interpret the output as probabilities: given [0.1 0.2 0.7], the image has a 10% probability of belonging to the first class, 20% to the second class, and 70% to the third. So I would change to: model.add(Dense(3, activation='softmax'))
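Putting the loss and activation points together, a minimal sketch of the corrected top model (layer sizes taken from your question; the Dropout rate is an assumption):
from keras.models import Sequential
from keras.layers import Dropout, Flatten, Dense

model = Sequential()
model.add(Flatten(input_shape=train_data.shape[1:]))  # train_data as loaded in your code
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))  # rate assumed
model.add(Dense(3, activation='softmax'))  # probabilities over the 3 classes
model.compile(optimizer='adam',
              loss='categorical_crossentropy',  # exactly one class per image
              metrics=['accuracy'])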
So to summarize: the network complains because, for each image, it expects the target you provide to be a one-hot vector of length 3, encoding the class the image belongs to. What you are currently feeding it is just a single number, 0 or 1.
Does it make more sense?
Upvotes: 5