Reputation: 73
My model is experiencing wild and big fluctuations in the validation loss and does not converge.
I am doing an image recognition project with my three dogs i.e. classifying the dog in the image. Two dogs are very similar and the 3rd is very different. I took 10 minute videos of each dog, separately. Frames were extracted as images at each second. My dataset consists of about 1800 photos, 600 of each dog.
This block of code is responsible for augmenting and creating the data to feed the model.
randomize = np.arange(len(imArr)) # imArr is the numpy array of all the images
np.random.shuffle(randomize) # Shuffle the images and labels
imArr = imArr[randomize]
imLab= imLab[randomize] # imLab is the array of labels of the images
lab = to_categorical(imLab, 3)
gen = ImageDataGenerator(zoom_range = 0.2,horizontal_flip = True , vertical_flip = True,validation_split = 0.25)
train_gen = gen.flow(imArr,lab,batch_size = 64, subset = 'training')
test_gen = gen.flow(imArr,lab,batch_size =64,subset = 'validation')
This picture is the result of the model below.
model = Sequential()
model.add(Conv2D(16, (11, 11),strides = 1, input_shape=(imgSize,imgSize,3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(3,3),strides = 2))
model.add(BatchNormalization(axis=-1))
model.add(Conv2D(32, (5, 5),strides = 1))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(3,3),strides = 2))
model.add(BatchNormalization(axis=-1))
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(3,3),strides = 2))
model.add(BatchNormalization(axis=-1))
model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(BatchNormalization(axis=-1))
model.add(Dropout(0.3))
#Fully connected layer
model.add(Dense(256))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(Dropout(0.3))
model.add(Dense(3))
model.add(Activation('softmax'))
sgd = SGD(lr=0.004)
model.compile(loss='categorical_crossentropy', optimizer=Adam(), metrics=['accuracy'])
batch_size = 64
epochs = 100
model.fit_generator(train_gen, steps_per_epoch=(len(train_gen)), epochs=epochs, validation_data=test_gen, validation_steps=len(test_gen),shuffle = True)
Things I have tried.
Things I have not tried
It seems that there is some form of randomness or too many parameters in my model. I am aware that it is currently overfitting, but that should not be the cause of the volatility(?). I am not too worried about the performance of the model. I would like to achieve about a 70% accuracy. All I want to do now is to stabilise the validation accuracy and to converge.
Note:
Upvotes: 2
Views: 918
Reputation: 2206
Change the optimizer to Adam, definitely better. In your code you are using it but with default parameters, you are creating an SGD optimizer but in the compile line you introduce an Adam with no parameters. Play with the actual parameters of your optimizer.
I encourage you to take out the dropout first, see what is happening and the if you manage to overfit, start with low dropout and go up.
Also it might be due to some of your test samples are very hard to detect and thus increase the loss, maybe take out the shuffle in the validation set and watch for any peridiocities to try to find out if there are validation samples hard to detect.
Hope it helps!
Upvotes: 2
Reputation: 2010
I see you have tried a lot of different things. Few suggestions:
Conv2D
eg. 11x11
and 5x5
. If your image dimensions are not very big, you should definitely go for lower filter dimensions like 3x3
.Adam
with varying lr
if you haven't.Otherwise, I don't see much problems. Maybe you need more data for the network to learn better.
Upvotes: 1