Reputation: 141
So, I'm doing a 4 label x-ray images classification on around 12600 images:
Class1:4000
Class2:3616
Class3:1345
Class4:4000
I'm using VGG-16 architecture pertained on the imageNet dataset with cross-entrpy and SGD and a batch size of 32 and a learning rate of 1e-3 running on pytorch
[[749., 6., 50., 2.],
[ 5., 707., 9., 1.],
[ 56., 8., 752., 0.],
[ 4., 1., 0., 243.]]
I know since both train loss/acc are relatively 0/1 the model is overfitting, though I'm surprised that the val acc is still around 0.9! How to properly interpret that and what causing it and how to prevent it? I know it's something like because the accuracy is the argmax of softmax like the actual predictions are getting lower and lower but the argmax always stays the same, but I'm really confused about it! I even let it train for +64 epochs same results flat acc while loss increases gradually!
PS. I have seen other questions with answers and didn't really get an explanation
Upvotes: -1
Views: 909
Reputation: 2720
I think your question already says about what is going on. Your model is overfitting as you have also figured out. Now, as you are training more your model slowly becoming more specialized to the train set and loosing the the capability to generalize gradually. So the softmax probabilities are getting more and more flat. But still it is showing more or less the same accuracy for validation set as still now the correct class has at least slightly more probability than the others. So in my opinion there can be some possible reasons for this:
Upvotes: 3