shader

Reputation: 71

Is my model underfitting, tensorflow?

My loss first decreased for a few epochs, then started increasing, rose up to a certain point, and then stopped moving. I think it has now converged. Can we say that my model is underfitting? My interpretation (slide 93 link) is that a loss that goes down and then starts increasing indicates a high learning rate. I decay my learning rate every 2 epochs, so after a few epochs the loss stopped increasing because the learning rate is now low. Since I am still decaying the learning rate, according to slide 93 the loss should start decreasing again, but it doesn't. Can we say that the loss is not decreasing further because my model is underfitting?
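The schedule described above (decaying every 2 epochs) can be sketched as a simple step decay; the initial rate, decay factor, and step size below are hypothetical placeholders, not values from the question:

```python
def step_decay(epoch, initial_lr=0.1, decay_factor=0.5, step=2):
    """Return the learning rate for a given epoch.

    The rate is multiplied by `decay_factor` once every `step` epochs
    (all numeric values here are hypothetical).
    """
    return initial_lr * (decay_factor ** (epoch // step))

# Epochs 0-1 use 0.1, epochs 2-3 use 0.05, epochs 4-5 use 0.025, ...
for epoch in range(6):
    print(epoch, step_decay(epoch))
```

In Keras, a schedule function like this is typically wired in via the `tf.keras.callbacks.LearningRateScheduler` callback passed to `model.fit`.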

Upvotes: 0

Views: 1313

Answers (1)

Dennis Soemers

Reputation: 8488

So, to summarize, the loss on the training data:

  • first went down
  • then it went up again
  • then it remained at the same level
  • then the learning rate was decayed
  • and the loss did not go back down again (it still stays the same with the decayed learning rate)

To me it sounds like the learning rate was too high initially, and the optimization got stuck in a local minimum afterwards. Decaying the learning rate at that point, once it's already stuck in a local minimum, is not going to help it escape that minimum. Setting the initial learning rate to a lower value is more likely to be beneficial, so that you don't end up in the "bad" local minimum to begin with.

It is possible that your model is now underfitting, and that making the model more complex (more nodes in hidden layers, for instance) would help. This is not necessarily the case though.

Are you using any techniques to avoid overfitting? For example, regularization and/or dropout? If so, it is also possible that your model was initially overfitting (when the loss was going down, before it went back up again). To get a better idea of what's going on, it would be beneficial to plot not only your loss on the training data, but also loss on a validation set. If the loss on your training data drops significantly below the loss on the validation data, you know it's overfitting.

Upvotes: 2
