Reputation: 81
I am trying to find the optimal parameters for my MLPRegressor through hyperparameter tuning, using RandomizedSearchCV and then GridSearchCV. My question is: should I set early_stopping to True or False, and why? I have read about it online but couldn't understand it well; based on what I read, if it is set to True it can be useful to avoid overfitting?
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import RandomizedSearchCV

mlp = MLPRegressor(random_state=42)
param_grid_random = {'hidden_layer_sizes': [(18,), (18, 18), (18, 18, 18)],
                     'activation': ['tanh', 'relu', 'logistic'],
                     'solver': ['sgd', 'adam'],
                     'learning_rate': ['constant', 'adaptive', 'invscaling'],
                     'alpha': [0.0001, 0.05],
                     'max_iter': [10000000000],
                     'early_stopping': [False],
                     'warm_start': [False]}
GS_random = RandomizedSearchCV(mlp, param_distributions=param_grid_random,
                               n_jobs=1, cv=5, scoring='r2', n_iter=100,
                               random_state=42)  # scoring='neg_mean_squared_error'
GS_random.fit(X_train, y_train)
print(GS_random.best_params_)
Upvotes: 1
Views: 1000
Reputation: 175
Yes, you are correct that early stopping is used to prevent overfitting.
If set to True, the solver will automatically set aside some validation data from the training set (10% by default) and will stop iterating when the score (R² for a regressor) on that validation set stops improving, regardless of whether there is still improvement in the loss to be gained by continuing to iterate over the training set, which would be the normal criterion for stopping.
In theory this stops any further training at the point where generalisation starts to suffer. It should act as an ongoing check for over-fitting.
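As a minimal sketch of the knobs involved (these are the standard MLPRegressor parameters; the 10% is just the default for validation_fraction and can be changed):

from sklearn.neural_network import MLPRegressor

# early_stopping=True holds out validation_fraction of the training data;
# training stops once the validation score fails to improve by at least tol
# for n_iter_no_change consecutive epochs.
mlp = MLPRegressor(early_stopping=True,
                   validation_fraction=0.1,  # default: 10% of the training set
                   n_iter_no_change=10,      # epochs without improvement before stopping
                   tol=1e-4,                 # minimum improvement that counts
                   random_state=42)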
You'll notice a lot of 'shoulds' and 'in theories': because of the random nature of selecting that 10% of the data, it's not guaranteed to prevent over-fitting, but it'll do a rather good job.
Exact details are in the docs: https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPRegressor.html
Whether you should set it to True or False in this case depends on your data and the purpose of your model: whether overfitting matters, how prone your data might be to overfitting, whether you have enough data in your training set that you can afford to set aside 10% for ongoing validation, and several other reasons that I won't have encountered yet.
The complication is how it interacts with RandomizedSearchCV, which will partly be finding the parameters that avoid overfitting anyway. Setting early_stopping to True could help optimise the search by already dealing with overfitting.
However, I believe the optimum parameters returned by the randomized search would only be valid for whichever setting of early_stopping you selected, True or False.
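One way to sidestep that, sketched below on the assumption that you keep the rest of your grid unchanged, is to let the search compare both settings itself by including early_stopping in the distributions (validation_fraction is only used when early_stopping=True):

from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import RandomizedSearchCV

mlp = MLPRegressor(random_state=42)
# Hypothetical extension of the grid from the question: let the search
# evaluate both settings instead of fixing early_stopping up front.
param_grid_random = {'hidden_layer_sizes': [(18,), (18, 18), (18, 18, 18)],
                     'activation': ['tanh', 'relu', 'logistic'],
                     'solver': ['sgd', 'adam'],
                     'early_stopping': [True, False],
                     'validation_fraction': [0.1, 0.2]}  # ignored when early_stopping=False
GS_random = RandomizedSearchCV(mlp, param_distributions=param_grid_random,
                               n_iter=100, cv=5, scoring='r2', random_state=42)
# GS_random.fit(X_train, y_train); best_params_ will then tell you which
# early_stopping setting won under cross-validation.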
Upvotes: 1