Reputation:
I am following the guide by rajshah4: https://github.com/rajshah4/image_keras/blob/master/notebook_extras.ipynb
The idea is to apply VGG16 to my dataset, which is composed of spectrograms, and let it decide between 2 classes: normal and abnormal.
However, the model isn't learning: I get around 0.5 val_acc even with my own top layer.
Am I doing something wrong? I'll leave my code below:
```python
import numpy as np
from keras import applications
from keras.applications.vgg16 import preprocess_input
from keras.layers import Dense, Dropout, Flatten
from keras.models import Sequential
from keras.preprocessing.image import ImageDataGenerator

# dimensions of our images
img_width, img_height = 240, 240
train_data_dir = '/content/gdrive/My Drive/Melspec/melspecimages/train'
validation_data_dir = '/content/gdrive/My Drive/Melspec/melspecimages/val'
batch_size = 32

# VGG16 preprocessing only, no augmentation
datagen = ImageDataGenerator(preprocessing_function=preprocess_input)

# convolutional base of VGG16, used as a frozen feature extractor
model_vgg = applications.VGG16(include_top=False, weights='imagenet', input_shape=(240, 240, 3))
model_vgg.trainable = False

train_generator_bottleneck = datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary',
    shuffle=True)
validation_generator_bottleneck = datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary',
    shuffle=False)

train_samples = 30272
validation_samples = 7584

# run the images through VGG16 once and save the bottleneck features
bottleneck_features_train = model_vgg.predict_generator(train_generator_bottleneck, train_samples // batch_size)
np.save(open('/content/gdrive/My Drive/Melspec/spec_vgg_bottleneck_features_train.npy', 'wb'), bottleneck_features_train)
bottleneck_features_validation = model_vgg.predict_generator(validation_generator_bottleneck, validation_samples // batch_size)
np.save(open('/content/gdrive/My Drive/Melspec/spec_vgg_bottleneck_features_validation.npy', 'wb'), bottleneck_features_validation)

# reload the features and build the labels by hand: first half 0, second half 1
train_data = np.load(open('/content/gdrive/My Drive/Melspec/spec_vgg_bottleneck_features_train.npy', 'rb'))
train_labels = np.array([0] * (train_samples // 2) + [1] * (train_samples // 2))
validation_data = np.load(open('/content/gdrive/My Drive/Melspec/spec_vgg_bottleneck_features_validation.npy', 'rb'))
validation_labels = np.array([0] * (validation_samples // 2) + [1] * (validation_samples // 2))

# small fully connected classifier trained on top of the bottleneck features
model_top = Sequential()
model_top.add(Flatten(input_shape=train_data.shape[1:]))
model_top.add(Dense(256, activation='relu'))
model_top.add(Dropout(0.5))
model_top.add(Dense(1, activation='sigmoid'))
model_top.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])

# NOTE: epochs is not defined in this snippet
model_top.fit(train_data, train_labels,
              epochs=epochs,
              batch_size=batch_size,
              validation_data=(validation_data, validation_labels))
```
Upvotes: 1
Views: 36
Reputation: 11
Found the answer: my labels were wrong.
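I had read on the web that we should use shuffle=True for the train generator. However, with shuffle=True the files are fed to VGG16 in a random order, so the saved bottleneck features are no longer sorted by class, while my label array still assumes they are (first half 0, second half 1). This led to the wrong labelling.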
I switched to shuffle=False and also class_mode=None.
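To illustrate the mismatch described above, here is a minimal sketch in plain Python. It uses lists of class indices instead of real images, and the helper names (`hand_built_labels`, `label_agreement`) are made up for this example. It only assumes that an unshuffled generator yields files class by class, while a shuffled one yields them in random order.

```python
import random


def hand_built_labels(n_samples):
    """Labels built as in the question: first half 0, second half 1."""
    return [0] * (n_samples // 2) + [1] * (n_samples // 2)


def label_agreement(true_classes, assumed_labels):
    """Fraction of positions where the assumed label equals the true class."""
    matches = sum(t == a for t, a in zip(true_classes, assumed_labels))
    return matches / len(assumed_labels)


n = 1000  # hypothetical, balanced dataset size

# Order in which classes come out of the generator with shuffle=False:
# all files of class 0 first, then all files of class 1.
ordered = [0] * (n // 2) + [1] * (n // 2)

# Order with shuffle=True: the same files, but in random order.
rng = random.Random(0)
shuffled = ordered[:]
rng.shuffle(shuffled)

agreement_ordered = label_agreement(ordered, hand_built_labels(n))
agreement_shuffled = label_agreement(shuffled, hand_built_labels(n))

print("shuffle=False agreement:", agreement_ordered)
print("shuffle=True agreement:", agreement_shuffled)
```

With the shuffled order, only about half of the hand-built labels match the true class, which is consistent with a val_acc stuck near 0.5 for two balanced classes.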
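I also had to make sure that both classes had the same number of files in my dataset and that the totals were divisible by my batch_size. The label array assumes an even split between the classes, and predicting for samples // batch_size steps skips any leftover images.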
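The arithmetic behind these two constraints can be sketched as follows. The helper names are hypothetical, and the uneven and unbalanced counts are invented for illustration; only 30272, 7584 and the batch size of 32 come from the question.

```python
def features_produced(n_samples, batch_size):
    """Rows of features obtained when predicting for n_samples // batch_size steps."""
    steps = n_samples // batch_size  # integer division drops a partial last batch
    return steps * batch_size


def labels_from_counts(class_counts):
    """Build labels in class order from per-class file counts (hypothetical helper)."""
    labels = []
    for class_index, count in enumerate(class_counts):
        labels.extend([class_index] * count)
    return labels


# Sample counts from the question: both divide evenly by 32, so nothing is skipped.
train_rows = features_produced(30272, 32)
val_rows = features_produced(7584, 32)

# Hypothetical count that does not divide evenly: the last 28 images are skipped,
# so a label array of length 30300 would not match the feature array.
uneven_rows = features_produced(30300, 32)

# Hypothetical unbalanced split: labels follow the real per-class counts
# instead of assuming half and half.
unbalanced = labels_from_counts([3, 5])

print(train_rows, val_rows, uneven_rows, unbalanced)
```

If the classes are not balanced, building the labels from the real per-class counts, as in `labels_from_counts`, avoids the half-and-half assumption. Keras directory iterators also expose a `classes` attribute with the class index of each file, which may be a simpler source for the labels when shuffle=False.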
Hope this helps other beginners !
Upvotes: 1