Marco

Reputation: 1235

Interpretation of train-validation loss of a Neural Network

I have trained an LSTM model for time series forecasting, using early stopping with a patience of 150 epochs and a dropout of 0.2. This is the plot of the training and validation loss:

[plot: training and validation loss per epoch]

Early stopping ended the training after 650 epochs and saved the best weights from around epoch 460, where the validation loss was lowest.
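For reference, a minimal sketch of a setup like the one described above. Only the dropout rate (0.2) and the patience (150) come from the question; the architecture, optimizer, and data shapes are placeholder assumptions:

    import numpy as np
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import LSTM, Dropout, Dense
    from tensorflow.keras.callbacks import EarlyStopping

    # Dummy data standing in for the real hourly series (placeholders):
    # 24-hour input windows with a single feature.
    X_train = np.random.rand(800, 24, 1); y_train = np.random.rand(800, 1)
    X_val   = np.random.rand(200, 24, 1); y_val   = np.random.rand(200, 1)

    model = Sequential([
        LSTM(64, input_shape=(24, 1)),
        Dropout(0.2),                  # dropout rate from the question
        Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")

    # restore_best_weights=True reloads the weights from the epoch with
    # the lowest validation loss (around epoch 460 in the plot above).
    early_stop = EarlyStopping(monitor="val_loss", patience=150,
                               restore_best_weights=True)

    history = model.fit(X_train, y_train,
                        validation_data=(X_val, y_val),
                        epochs=2000, verbose=0, callbacks=[early_stop])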

My question is: is it normal for the training loss to stay above the validation loss? I know that the opposite (validation loss above training loss) would be a sign of overfitting, but what about this case?

EDIT: My dataset is a time series with hourly frequency, composed of 35,000 instances. I split the data into 80% training and 20% validation, keeping the temporal order: for example, the training set contains the data up to the beginning of 2017 and the validation set the data from 2017 onward. I created this plot by averaging the data over 15-day windows:

[plot: data averaged over 15-day windows]
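A minimal sketch of that chronological split (`data` is a placeholder for the hourly series):

    import numpy as np

    data = np.arange(35000, dtype=float)  # placeholder for the hourly series

    # Chronological 80/20 split: no shuffling, so every validation sample
    # is strictly later in time than every training sample.
    split = int(len(data) * 0.8)
    train, val = data[:split], data[split:]
    print(len(train), len(val))  # 28000 7000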

So maybe the reason is, as you said, that the validation data have an easier pattern. How can I solve this problem?

Upvotes: 0

Views: 913

Answers (2)

kerastf

Reputation: 509

Usually the opposite is true. But since you are using dropout, it is common for the validation loss to be lower than the training loss: dropout is active while the training loss is computed but disabled during validation. And, as others have suggested, try k-fold cross-validation.
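One way to check how much of the gap dropout explains (a sketch, assuming a trained Keras `model`, the `history` returned by `model.fit`, and the training arrays):

    # Keras averages the training loss over batches *with dropout active*,
    # while the validation loss is computed with dropout disabled.
    # Re-evaluating the training set in inference mode removes that bias.
    train_loss_no_dropout = model.evaluate(X_train, y_train, verbose=0)
    print("reported train loss (dropout on):", history.history["loss"][-1])
    print("train loss with dropout off    :", train_loss_no_dropout)
    # If the second number is close to the validation loss, dropout is
    # the main reason the validation loss looks lower.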

Upvotes: 1

Van

Reputation: 3767

In most cases, the validation loss should be higher than the training loss because the labels in the training set are directly accessible to the model. In fact, a good habit when training a new network is to train on a small subset of the data and check whether the training loss converges to 0 (i.e., the model fully overfits the subset). If it does not, the model lacks the capacity to memorize even that small amount of data.
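A hedged sketch of that sanity check, assuming a Keras-style `model` and training arrays like those above (names are placeholders):

    # Take a tiny subset and try to memorize it completely.
    X_small, y_small = X_train[:100], y_train[:100]
    model.fit(X_small, y_small, epochs=1000, verbose=0)

    loss = model.evaluate(X_small, y_small, verbose=0)
    print("loss on memorized subset:", loss)  # should approach 0
    # If the loss plateaus well above 0, the model (or the data pipeline)
    # cannot even memorize 100 samples and should be debugged first.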

Let's go back to your problem. Validation loss lower than training loss does happen, but it is possibly caused not by your model but by how you split the data. Suppose there are two types of patterns (A and B) in the dataset, and you split it so that the training set contains both A and B while the small validation set contains only B. If B is easier to recognize, you will get a higher training loss.

As a more extreme example, suppose pattern A is almost impossible to recognize but makes up only 1% of the dataset, while the model can recognize all of pattern B. If the validation set happens to contain only pattern B, the validation loss will be smaller than the training loss.

As alex mentioned, k-fold cross-validation is a good way to make sure every sample is used as both validation and training data; for a time series, use an order-preserving variant rather than a shuffled split. Also, printing the confusion matrix to check that all labels are reasonably balanced is another method to try (for classification tasks).
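Since the data here is a time series, a shuffled k-fold would leak future information into training; scikit-learn's `TimeSeriesSplit` is an order-preserving alternative (a sketch, with `data` as a placeholder):

    import numpy as np
    from sklearn.model_selection import TimeSeriesSplit

    data = np.arange(35000)  # placeholder for the hourly series

    # Each validation fold lies strictly after its training fold in time.
    tscv = TimeSeriesSplit(n_splits=5)
    for fold, (train_idx, val_idx) in enumerate(tscv.split(data)):
        print(f"fold {fold}: train={len(train_idx)}, val={len(val_idx)}")
        # fit and evaluate the model on data[train_idx] / data[val_idx] here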

Upvotes: 2
