Reputation: 41
I am training a neural network (NN). First, I built a NN with just one hidden layer. Second, I built a NN with two hidden layers. The number of neurons per hidden layer was kept constant at 5 in both networks. The other parameters of the training model are shown in the code below:
For one hidden layer:

from sklearn.neural_network import MLPRegressor

regr = MLPRegressor(hidden_layer_sizes=(5,), activation='logistic', solver='sgd',
                    alpha=0.0001, learning_rate='constant', learning_rate_init=0.001,
                    random_state=1).fit(X_treino, Y_treino)
For two hidden layers:

regr = MLPRegressor(hidden_layer_sizes=(5, 5), activation='logistic', solver='sgd',
                    alpha=0.0001, learning_rate='constant', learning_rate_init=0.001,
                    random_state=1).fit(X_treino, Y_treino)
Nevertheless, the score of the second NN is much worse than that of the first, and I don't understand why. Can anyone explain this to me?
The dataset I am using is available at http://archive.ics.uci.edu/ml/datasets/Airfoil+Self-Noise ; it poses a regression problem.
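For reference, here is a minimal, self-contained sketch of the comparison (the tab-separated airfoil_self_noise.dat file from the UCI page and the train/test split are my assumptions, since the question does not show how the data was loaded):

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

# UCI file: tab-separated, no header, 5 input columns + 1 target (sound pressure level in dB)
data = pd.read_csv('airfoil_self_noise.dat', sep='\t', header=None)
X, y = data.iloc[:, :5].values, data.iloc[:, 5].values
X_treino, X_teste, Y_treino, Y_teste = train_test_split(X, y, random_state=1)

for layers in [(5,), (5, 5)]:
    regr = MLPRegressor(hidden_layer_sizes=layers, activation='logistic', solver='sgd',
                        alpha=0.0001, learning_rate='constant', learning_rate_init=0.001,
                        random_state=1).fit(X_treino, Y_treino)
    print(layers, regr.score(X_teste, Y_teste))  # R^2 on the test split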
Upvotes: 1
Views: 1523
Reputation: 73
Creating bigger NNs does not always give better results. The most likely explanation here is the optimization itself: the weights of a model are learned by gradient descent (your solver='sgd'), and with the logistic activation the gradients shrink as they are propagated back through each additional layer (the vanishing-gradient problem), so the two-layer network can converge more slowly or get stuck. Note that the number of layers and the number of neurons per layer are not weights; they are hyperparameters. Cross-validation over multiple combinations of hyperparameters can be used to estimate the best configuration for a given problem, which is called hyperparameter tuning. Hyperparameter tuning can be done with the MLPRegressor like this:
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor

param_grid = {
    'hidden_layer_sizes': [(150, 100, 50), (120, 80, 40), (100, 50, 30)],
    'max_iter': [50, 100],
    'activation': ['tanh', 'relu'],
    'solver': ['sgd', 'adam'],
    'alpha': [0.0001, 0.05],
    'learning_rate': ['constant', 'adaptive'],
}

mlp = MLPRegressor(random_state=1)  # base estimator to tune (a regressor, not a classifier)
grid = GridSearchCV(mlp, param_grid, n_jobs=-1, cv=5)
grid.fit(X_train, y_train)
print(grid.best_params_)
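Since GridSearchCV refits the best configuration on the whole training set by default (refit=True), the tuned model can then be used directly; X_test and y_test below are an assumed hold-out split:

best = grid.best_estimator_          # MLPRegressor refit with the best parameters found
print(best.score(X_test, y_test))    # R^2 of the tuned model on held-out data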
The parameters can be chosen however you want; the full list is documented at https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPRegressor.html. Try different combinations to get better results.
Upvotes: 1