mbih

Reputation: 35

train, validation, test set in relation to evaluation metrics

I am getting a little confused about the notion of training, validation and test set and what exactly they should be used for.

My understanding is that we make a split in our data (for example 70% train, 10% validation, 20% test) and take the following steps:

  1. Initialize the model (choose the model we want to use)
  2. Train the model with the training data.
  3. Generate predictions over the validation period using the model.
  4. Evaluate metrics like MSE or RMSE using the validation data.
  5. Tune the hyperparameters of the model to optimize the chosen metric (whichever was chosen in step 4).
  6. With the optimal model, generate predictions over the test period.
  7. Lastly, evaluate the chosen metrics again using the test data. This will be the actual performance of the model if it were to be used in production.

In Python this would (roughly) look like the following (using statsmodels' ARIMA as the model, since ARIMA is not part of sklearn, and sklearn's mean_squared_error as the metric):

  1. model = ARIMA(endog=training_set, order=(p, d, q))
  2. arima_model = model.fit()
  3. validation_forecast = arima_model.forecast(steps=len(validation_set))
  4. print('mse:', mean_squared_error(validation_set, validation_forecast))
  5. Tune the model if necessary
  6. test_forecast = arima_model.forecast(steps=len(test_set))
  7. print('mse:', mean_squared_error(test_set, test_forecast))

Have I understood the process of training, validation and test sets correctly?

Also, how would I incorporate cross-validation in this scenario?

Upvotes: 0

Views: 669

Answers (1)

Dark Shadow

Reputation: 1

I think you have mostly understood how to use them correctly. Generally, the three sets are used as follows:

The training set is used to train the model, i.e. to extract the patterns and relationships in the data. Training is performed by adjusting the weights and parameters of the model to fit the input data.

The validation set is used to fine-tune the model by optimizing hyper-parameters such as the learning rate. It is used regularly during training; therefore the validation set does affect the final model indirectly, although it is not as crucial as the training set.

The test set is used only once, at the end of training, to evaluate the model on a completely new set of data that was never seen during training. It is often a public dataset that can be used to compare your performance to other models. It should contain enough variety so that the model is thoroughly tested on various cases.
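As a minimal sketch of how you might produce the 70/10/20 split from your question (this is my own illustration, using sklearn's train_test_split on a placeholder array, with shuffle=False so the chronological order of a time series is preserved):

import numpy as np
from sklearn.model_selection import train_test_split

data = np.arange(100)  # placeholder series; substitute your own data

# First carve off 70% for training; shuffle=False preserves time order
train_set, temp_set = train_test_split(data, test_size=0.30, shuffle=False)
# Split the remaining 30% into 10% validation and 20% test (of the original)
validation_set, test_set = train_test_split(temp_set, test_size=2/3, shuffle=False)

print(len(train_set), len(validation_set), len(test_set))  # 70 10 20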

I am not sure about the prediction of the length of the validation and test period. Cross-validation consists of partitioning the data into multiple folds and then training and evaluating successively on those folds, as such:

[image: k-fold cross-validation diagram]

With sklearn, it can be done like so:

from sklearn import datasets, svm
from sklearn.model_selection import cross_val_score

# Load a toy dataset and define the model to evaluate
X, y = datasets.load_iris(return_X_y=True)
clf = svm.SVC(kernel='linear', C=1, random_state=42)

# Fit and score the model on 5 cross-validation folds
scores = cross_val_score(clf, X, y, cv=5)

This will both fit the model and compute its score on 5 consecutive cross-validation partitions.
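One caveat: your data is a time series, and a plain k-fold split like the one above can leak future observations into the training folds. A minimal sketch of time-order-preserving cross-validation, assuming sklearn's TimeSeriesSplit and statsmodels' ARIMA, where the synthetic series and order=(1, 1, 1) are placeholders:

import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import TimeSeriesSplit
from statsmodels.tsa.arima.model import ARIMA

series = np.random.randn(200).cumsum()  # placeholder series; use your own data

# Each split trains on an initial segment and validates on the segment
# immediately after it, so the temporal order is never violated
tscv = TimeSeriesSplit(n_splits=5)
scores = []
for train_idx, val_idx in tscv.split(series):
    fitted = ARIMA(endog=series[train_idx], order=(1, 1, 1)).fit()  # order is a placeholder
    forecast = fitted.forecast(steps=len(val_idx))
    scores.append(mean_squared_error(series[val_idx], forecast))

print('mean cv mse:', np.mean(scores))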

Upvotes: 0
