Reputation: 169
Hello, I am getting overfitting with ResNet-50 pretrained weights. I am training on RGB images of files, and the dataset I am using comes with training and validation sets. I have 26 classes and about 14k images: 9k for training and 5k for validation.
The name of the dataset is MaleVis.
My training accuracy reaches 1.000 while my validation accuracy stays very low, never going above 0.50-0.55, so it seems to be overfitting, I think. Is there something wrong with the data, such as the per-class sample counts, or is there something wrong with my model?
I expect ResNet to perform well on this...
Here is my code:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import backend as K
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential, Model, load_model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from tensorflow.keras.layers import Input, Add, Dense, Activation, ZeroPadding2D, BatchNormalization, Flatten, Conv2D, AveragePooling2D, MaxPooling2D, GlobalMaxPooling2D, MaxPool2D
from tensorflow.keras.preprocessing import image
from tensorflow.keras.initializers import glorot_uniform
from tensorflow.keras.applications.resnet import ResNet50
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
# One batch from each split, to inspect shapes and the class distribution
samples = ImageDataGenerator().flow_from_directory(directory='malevis_train_val_300x300/train', target_size=(300, 300))
imgs, labels = next(samples)
print(imgs.shape, labels.shape)
samples2 = ImageDataGenerator().flow_from_directory(directory='malevis_train_val_300x300/val', target_size=(300, 300))
imgs2, labels2 = next(samples2)
classes = list(samples.class_indices.keys())
# Percentage of each class within the sampled batch
y = (sum(labels) / labels.shape[0]) * 100
plt.bar(classes, y)
plt.xticks(rotation='vertical')
plt.show()
X_train, y_train = imgs, labels
X_val, y_val = imgs2, labels2
def define_model():
    # ResNet50 backbone with ImageNet weights, global average pooling, no top
    model = ResNet50(weights='imagenet', pooling='avg', include_top=False, input_shape=(300, 300, 3))
    for layer in model.layers:
        layer.trainable = False  # freeze the pretrained backbone
    flat1 = Flatten()(model.layers[-1].output)
    class1 = Dense(256, activation='relu')(flat1)
    output = Dense(26, activation='softmax')(class1)
    model = Model(inputs=model.inputs, outputs=output)
    opt = Adam(learning_rate=0.001)
    model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])
    return model
model = define_model()
model.summary()
history1 = model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=200, batch_size=20, steps_per_epoch=4, shuffle=True)
scores = model.evaluate(X_val,y_val)
print('Final accuracy:', scores[1])
acc = history1.history['accuracy']
val_acc = history1.history['val_accuracy']
loss = history1.history['loss']
val_loss = history1.history['val_loss']
epochs = range(len(acc))
plt.plot(epochs, acc, 'r', label='Training accuracy')
plt.plot(epochs, val_acc, 'b', label='Validation accuracy')
plt.title('Training and validation accuracy')
plt.legend(loc=0)
plt.show()
I have tried different optimizers, loss functions, and target sizes, and increased the epochs and steps per epoch. Nothing really makes much difference; it still overfits. I am using softmax activation, freezing the pretrained layers, and removing the top; I then just add a dense layer and an output layer for the 26 classes. I have tried with shuffle set to both true and false.
Upvotes: 2
Views: 800
Reputation: 2670
I would like to suggest a few things; one of them might be helpful:

- Use the classes parameter inside flow_from_directory(), and make sure you have the proper folder structure that the documentation requires: flow_from_directory
- Change categorical_crossentropy to sparse_categorical_crossentropy if your output labels are not one-hot encoded. Ref: Probabilistic losses | SparseCategoricalCrossentropy
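Roughly something like this, as a minimal sketch (it assumes one sub-folder per class under malevis_train_val_300x300/train and .../val, and reuses the model you defined above; the class names below are placeholders, so list your actual 26 folder names):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# flow_from_directory() expects one sub-folder per class:
# malevis_train_val_300x300/
#     train/
#         ClassA/ ...   (one folder per class, 26 in total)
#     val/
#         ClassA/ ...

class_names = ['ClassA', 'ClassB']  # placeholders; use your 26 folder names, in the order you want mapped to 0..25

train_gen = ImageDataGenerator(rescale=1. / 255).flow_from_directory(
    directory='malevis_train_val_300x300/train',
    target_size=(300, 300),
    classes=class_names,   # fixes the class-to-index mapping explicitly
    class_mode='sparse',   # integer labels instead of one-hot vectors
    batch_size=32)
val_gen = ImageDataGenerator(rescale=1. / 255).flow_from_directory(
    directory='malevis_train_val_300x300/val',
    target_size=(300, 300),
    classes=class_names,   # same mapping as the training generator
    class_mode='sparse',
    batch_size=32)

# With class_mode='sparse' the matching loss is sparse_categorical_crossentropy;
# keep categorical_crossentropy only if you stay with the one-hot class_mode='categorical'.
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(train_gen, validation_data=val_gen, epochs=10)

The point is simply that the labels coming out of the generators and the loss you pass to compile() have to agree.

Upvotes: 1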