Alisher Narzulloyev

Reputation: 85

Hyperparameter optimization of MLPRegressor in scikit-learn

I am very new to machine learning in Python and would appreciate any help with the following problem.

I am trying to run MLPRegressor for a list of six different hidden-neuron counts, and for each count I want the training data to be shuffled three times, giving three scores per neuron count. The following code works and returns 18 scores (6 × 3). However, it does not seem efficient: it takes almost an hour to run. I have tried GridSearchCV(), but I don't know how to control the shuffling of the training data (three times for each hidden-neuron count). Can anybody suggest a better (faster) way of solving this?

import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import cross_val_score
from sklearn.utils import shuffle

n=3 # how many times to shuffle the training data
nhn_range=[8,10,12,14,16,18] # number of hidden neurons

nhn_scores = []
for nhn in nhn_range:
    mlp = MLPRegressor(hidden_layer_sizes=(nhn,), activation='tanh', 
                       solver='adam', shuffle=False, random_state=42, 
                       max_iter=20000, momentum=0.7, early_stopping=True, 
                       validation_fraction=0.15)
    for _ in range(n):
        df_train = shuffle(df_train)
        score = np.sqrt(-cross_val_score(mlp, df_train[feature_cols], 
                        df_train[response_cols], 
                        cv=5, scoring='neg_mean_squared_error')).mean()
        nhn_scores.append(score)

The code returns a flat list of scores. How can I get a simple data frame with 3 rows (one per shuffle) and 6 columns (one per hidden-neuron count)?

Thanks in advance

Upvotes: 4

Views: 5963

Answers (1)

Gambit1614

Reputation: 8801

Try this: collect the scores for each neuron count in a dictionary keyed by that count.

# Uses the imports, n, nhn_range and df_train from the question.
score_dict = {}  # maps neuron count -> list of n scores
for nhn in nhn_range:
    mlp = MLPRegressor(hidden_layer_sizes=(nhn,), activation='tanh', 
                       solver='adam', shuffle=False, random_state=42, 
                       max_iter=20000, momentum=0.7, early_stopping=True, 
                       validation_fraction=0.15)

    nhn_scores = []  # reset for every neuron count
    for _ in range(n):
        df_train = shuffle(df_train)
        # mean RMSE over the 5 folds
        score = np.sqrt(-cross_val_score(mlp, df_train[feature_cols], 
                    df_train[response_cols], 
                    cv=5, scoring='neg_mean_squared_error')).mean()
        nhn_scores.append(score)
    score_dict[nhn] = nhn_scores

Then convert score_dict to a DataFrame with from_dict. Each key becomes a column, so the result has 6 columns (one per neuron count) and 3 rows (one per shuffle):

import pandas as pd
score_df = pd.DataFrame.from_dict(score_dict)
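
On the speed part of the question, one option that may help is to let GridSearchCV do the looping and pass it a RepeatedKFold splitter, which reshuffles the data before each repeat, with n_jobs=-1 so the fits run in parallel. The following is only a rough sketch, not a tested drop-in replacement: the data is a synthetic stand-in generated with make_regression (an assumption, use your own df_train[feature_cols] and df_train[response_cols] instead), max_iter is lowered to keep the example quick, and the 'neg_root_mean_squared_error' scorer assumes scikit-learn 0.22 or newer. RepeatedKFold is also not identical to shuffling the frame and then running an unshuffled 5-fold split, although it serves the same purpose.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV, RepeatedKFold
from sklearn.neural_network import MLPRegressor

# Synthetic stand-in for the question's training data (assumption).
X, y = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=0)

n_repeats, n_splits = 3, 5               # 3 reshuffles, 5 folds each
nhn_range = [8, 10, 12, 14, 16, 18]      # number of hidden neurons

# Each repeat draws a new random 5-fold partition of the data.
cv = RepeatedKFold(n_splits=n_splits, n_repeats=n_repeats, random_state=42)

# momentum is left out: scikit-learn only uses it when solver='sgd'.
mlp = MLPRegressor(activation='tanh', solver='adam', shuffle=False,
                   random_state=42, max_iter=500, early_stopping=True,
                   validation_fraction=0.15)

search = GridSearchCV(
    mlp,
    param_grid={'hidden_layer_sizes': [(nhn,) for nhn in nhn_range]},
    scoring='neg_root_mean_squared_error',
    cv=cv,
    n_jobs=-1,      # run the fits in parallel on all cores
    refit=False,    # only the scores are needed here
)
search.fit(X, y)

# cv_results_ stores one array per split, each with one entry per candidate.
# Stack them into shape (n_candidates, n_repeats * n_splits) and flip the sign.
split_scores = np.array([-search.cv_results_[f'split{i}_test_score']
                         for i in range(n_repeats * n_splits)]).T

# Splits are ordered repeat by repeat, so average the folds within each repeat.
per_repeat = split_scores.reshape(len(nhn_range), n_repeats, n_splits).mean(axis=2)

# 3 rows (one per repeat) x 6 columns (one per neuron count).
score_df = pd.DataFrame(per_repeat.T, columns=nhn_range)
print(score_df)
```

Whether this is actually faster depends on how many cores are available; the total number of fits (6 × 3 × 5 = 90) is the same as in the loop version.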

Upvotes: 3
