Reputation: 1529
I'm trying to do transfer learning by fine-tuning a VGG16 network (pre-trained on ImageNet) for image classification on a substantially smaller dataset (11,000 images, 200 classes). I'm only training the last 3 FC layers of the modified VGG16 network, and I've added dropout with a probability of 0.5 on two of those 3 FC layers.
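For concreteness, here is a minimal sketch of this kind of setup (Keras and the FC layer sizes here are illustrative assumptions, not my exact code):

```python
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.models import Model

# Load the convolutional base pre-trained on ImageNet and freeze it,
# so only the new FC head below gets trained.
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
for layer in base.layers:
    layer.trainable = False

x = Flatten()(base.output)
x = Dense(4096, activation="relu")(x)
x = Dropout(0.5)(x)                        # dropout on the first FC layer
x = Dense(4096, activation="relu")(x)
x = Dropout(0.5)(x)                        # dropout on the second FC layer
out = Dense(200, activation="softmax")(x)  # 200 target classes

model = Model(base.input, out)
model.compile(optimizer="sgd",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```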
When training this network, I'm not doing any fancy pre-processing except subtracting the per-channel VGG_MEAN values given by the original authors from each image.
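For reference, that mean subtraction is just the following (assuming BGR channel order, as in the original Caffe release of the weights):

```python
import numpy as np

# Per-channel means published with the original VGG16 weights (BGR order).
VGG_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def preprocess(image_bgr):
    """Subtract the per-channel mean from an HxWx3 BGR image."""
    return image_bgr.astype(np.float32) - VGG_MEAN
```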
The training seems to proceed well: the loss goes down substantially and stabilizes around a certain value. I'm monitoring the network's prediction accuracy on a validation set (20% of the data) after a fixed number of batches have been trained. However, the average validation accuracy shows no trend of improvement; it fluctuates throughout training, when I was hoping it would increase gradually. I've made sure not to shuffle the validation data during inference.
I have tried reducing the learning rate and fine-tuning fewer layers, but to no avail. If the loss is a surrogate indicating that the model is actually learning, why the discrepancy in validation accuracy?
(1) Is it because I have very little training data to begin with? (2) The original ImageNet dataset has 1000 classes, but my classification task is more fine-grained and has 1/5th of that number of classes (think of classifying species of birds, or different primates). Could this be an issue? I would like some opinion and feedback from people who have experience working on problems like this.
Upvotes: 0
Views: 888
Reputation: 51
The value of the training loss has a strong relation to the training accuracy: if the loss decreases during training, the training accuracy will increase. But there is no such strong relation between the training loss and the validation accuracy. If the training loss is decreasing and the validation accuracy is increasing, that is what we expect. But if the training loss is decreasing while the validation accuracy saturates or drops, then overfitting may have happened. In that case, training should be stopped and some parameters adjusted, such as the weight decay (for regularization) and the dropout rate.
So the training loss cannot directly substitute for the validation accuracy. If possible, the validation accuracy should be monitored rather than only looking at the training loss curve. The above is my understanding.
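In Keras, for example, stopping on validation accuracy instead of training loss can be done with a callback (a sketch; the exact metric name, `val_accuracy` vs. `val_acc`, depends on the Keras version):

```python
from tensorflow.keras.callbacks import EarlyStopping

# Stop when validation accuracy stops improving and keep the best weights.
early_stop = EarlyStopping(monitor="val_accuracy", patience=5,
                           restore_best_weights=True)

model.fit(x_train, y_train,
          validation_split=0.2,   # same 80/20 split as in the question
          epochs=100,
          callbacks=[early_stop])
```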
Upvotes: 1
Reputation: 1529
It turns out that I was facing a "fine-grained" classification problem. Images of different bird species looked very similar to each other, and this made it difficult for the network to learn discriminative features.
Upvotes: 0