Chong Lip Phang

Reputation: 9279

Questions on Scikit-Learn early stopping

I have some questions on Scikit-Learn MLPRegressor when early stopping is enabled:

  1. Is the validation data (see 'validation_fraction') randomly selected, or taken from the front or the back of the training data supplied?

  2. Is the validation data the same or different during successive iterations of the training?

  3. Will the validation data automatically be included/refit during the final stage of the training?

  4. When the validation score has not improved by at least tol for n_iter_no_change consecutive epochs, will the previous best regressor be returned, or will fit() simply return the last regressor?

Upvotes: 0

Views: 1636

Answers (1)

mujjiga

Reputation: 16906

Is the validation data (see 'validation_fraction') randomly selected, or taken from the front or the back of the training data supplied?

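MLPRegressor calls train_test_split internally on the data passed to fit() to create the validation set, so the fraction comes from the training data, not from any test data. As far as I can tell from the scikit-learn source, that call sets random_state and test_size=validation_fraction but does not forward the estimator's shuffle argument, so the validation rows are selected randomly whether shuffle is True or False. The shuffle argument only controls whether the training samples are reshuffled in each iteration.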
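
For reference, this is how train_test_split itself picks the held-out rows with and without shuffling. A minimal sketch on toy data; it demonstrates the helper only, and which arguments MLPRegressor passes to it is worth confirming against the installed scikit-learn source.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data: 10 rows whose target equals the row index, so the split is easy to read.
X = np.arange(10).reshape(-1, 1)
y = np.arange(10)

# Default behaviour (shuffle=True): a random 20% is held out,
# reproducible through random_state.
_, _, _, y_val_random = train_test_split(X, y, test_size=0.2, random_state=0)

# shuffle=False: the held-out rows are the last 20%, in their original order.
_, _, _, y_val_tail = train_test_split(X, y, test_size=0.2, shuffle=False)

print("random hold-out:", sorted(y_val_random.tolist()))
print("tail hold-out:  ", y_val_tail.tolist())
```

With shuffle=False the hold-out is always the tail of the array, which matters for ordered data such as time series.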

Is the validation data the same or different during successive iterations of the training?

The validation data is the same for every iteration of training: the split is made once, at the start of fit().

Will the validation data automatically be included/refit during the final stage of the training?

Validation data will never be used for training the model. It is used only for scoring the model.
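
If the held-out rows should still contribute to the final model, one workaround is to refit by hand. A minimal sketch, assuming synthetic data from make_regression; reusing the best epoch count for a second fit is my own heuristic, not something MLPRegressor does for you, and the names probe and final are just illustrative.

```python
import warnings

import numpy as np
from sklearn.datasets import make_regression
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=300, n_features=5, noise=0.1, random_state=0)

# First pass: early stopping holds out 10% of the rows for validation scoring only.
probe = MLPRegressor(hidden_layer_sizes=(20,), early_stopping=True,
                     validation_fraction=0.1, max_iter=200, random_state=0)
with warnings.catch_warnings():
    warnings.simplefilter("ignore", ConvergenceWarning)
    probe.fit(X, y)

# Epoch (1-based) at which the validation score peaked.
best_epoch = int(np.argmax(probe.validation_scores_)) + 1

# Second pass (assumed heuristic): train on all rows, with no hold-out,
# for at most that many epochs.
final = MLPRegressor(hidden_layer_sizes=(20,), early_stopping=False,
                     max_iter=best_epoch, random_state=0)
with warnings.catch_warnings():
    warnings.simplefilter("ignore", ConvergenceWarning)
    final.fit(X, y)

print("best epoch:", best_epoch, "epochs used in refit:", final.n_iter_)
```

The second model sees more data than the first, so the best epoch count is only an approximation of where it should stop.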

When the validation score has not improved by at least tol for n_iter_no_change consecutive epochs, will the previous best regressor be returned, or will fit() simply return the last regressor?

If the validation score stops improving, early stopping halts training instead of continuing (to avoid overfitting), and the fitted model keeps the best parameters seen during training rather than the last ones (link)
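
The fitted estimator exposes the per-epoch validation scores, so the stopping behaviour can be inspected after the fact. A minimal sketch, assuming synthetic data from make_regression and arbitrary hyperparameters chosen only for illustration:

```python
import warnings

from sklearn.datasets import make_regression
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=300, n_features=5, noise=0.1, random_state=0)

reg = MLPRegressor(hidden_layer_sizes=(20,), early_stopping=True,
                   validation_fraction=0.1, n_iter_no_change=5, tol=1e-4,
                   max_iter=300, random_state=0)
with warnings.catch_warnings():
    # Hitting max_iter before the stopping rule fires only raises a warning.
    warnings.simplefilter("ignore", ConvergenceWarning)
    reg.fit(X, y)

scores = reg.validation_scores_  # one validation score (R^2) per epoch
print("epochs run:           ", reg.n_iter_)
print("best validation score:", reg.best_validation_score_)
print("last validation score:", scores[-1])
```

If the last score is lower than the best one, training ran past the best epoch before stopping, which is the case where restoring the best parameters makes a difference.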

Upvotes: 1
