stayhydrated03

Reputation: 66

'GridSearchCV' object has no attribute 'best_params_' when using LogisticRegression

Below is the code that I am trying to execute:

# Train a logistic regression model, report the coefficients and model performance 
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import cross_val_score
from sklearn import metrics

clf = LogisticRegression().fit(X_train, y_train)
params = {'penalty':['l1','l2'],'dual':[True,False],'C':[0.001, 0.01, 0.1, 1, 10, 100, 1000], 'fit_intercept':[True,False],
        'solver':['saga']}
gridlog = GridSearchCV(clf, params, cv=5, n_jobs=2, scoring='roc_auc')

cv_scores = cross_val_score(gridlog, X_train, y_train)

#find best parameters
print('Logistic Regression parameters: ',gridlog.best_params_) # throws error

The last line of code above is where the error is thrown. I have used this exact same code to run other models. Any idea why I may be facing this issue?

Upvotes: 1

Views: 8305

Answers (2)

s.dallapalma

Reputation: 1315

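Your code should be updated so that an unfitted LogisticRegression classifier is passed to GridSearchCV, and fit is then called on the grid search object itself. The best_params_ attribute only exists after gridlog.fit(...) has run: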

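from sklearn.datasets import load_breast_cancer # For example only
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X_train, y_train = load_breast_cancer(return_X_y=True)

# Note: dual=True is only implemented for the liblinear solver with an l2 penalty,
# so the dual=True candidates are expected to fail with 'saga' and be scored as NaN.
params = {'penalty':['l1', 'l2'],'dual':[True, False],'C':[0.001, 0.01, 0.1, 1, 10, 100, 1000], 'fit_intercept':[True, False],
        'solver':['saga']}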

gridlog = GridSearchCV(LogisticRegression(), params, cv=5, n_jobs=2, scoring='roc_auc')
gridlog.fit(X_train, y_train)

#find best parameters
print('Logistic Regression parameters: ', gridlog.best_params_) # Now it displays all the parameters selected by the grid search

Results

Logistic Regression parameters:  {'C': 0.1, 'dual': False, 'fit_intercept': True, 'penalty': 'l2', 'solver': 'saga'}

Note, as @desertnaut pointed out, you don't need cross_val_score to use GridSearchCV: the grid search runs its own cross-validation when you call fit.

See a complete example of how to use GridSearch here. The example uses an SVC classifier instead of LogisticRegression, but the approach is the same.
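
To illustrate the note about cross_val_score, here is a minimal sketch of reading the cross-validated scores straight from the fitted search object. The dataset, the reduced C grid and the liblinear solver are my own choices to keep the example small, not something taken from the question.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Example data only; the question uses its own X_train / y_train
X, y = load_breast_cancer(return_X_y=True)

# Reduced grid and liblinear solver are assumptions made for a quick run
params = {"C": [0.01, 0.1, 1]}
search = GridSearchCV(LogisticRegression(solver="liblinear"), params,
                      cv=5, scoring="roc_auc")
search.fit(X, y)

# Mean cross-validated ROC AUC of the best candidate
best_auc = search.best_score_
# Mean cross-validated ROC AUC of every candidate, in grid order
mean_aucs = search.cv_results_["mean_test_score"]

print("Best parameters:", search.best_params_)
print("Best mean ROC AUC:", best_auc)
print("Mean ROC AUC per candidate:", mean_aucs)
```

Wrapping the search in cross_val_score is still valid if you want nested cross-validation, but it is not required to obtain best_params_ or the per-candidate scores.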

Upvotes: 1

Mehul Gupta

Reputation: 1939

You need to fit gridlog first. cross_val_score will not do this: it fits clones of the estimator on each fold and returns only the scores, leaving gridlog itself unfitted. Since gridlog was never fitted, accessing best_params_ throws the error.

The code below works:

from sklearn import datasets
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Small subset of the breast cancer dataset, for illustration only
data = datasets.load_breast_cancer()
x = data.data[:150]
y = data.target[:150]

# GridSearchCV clones the estimator, so it does not need to be fitted beforehand
clf = LogisticRegression()
params = {'C':[0.001, 0.01, 0.1, 1, 10, 100, 1000]}
gridlog = GridSearchCV(clf, params, cv=2, n_jobs=2, scoring='roc_auc')
gridlog.fit(x, y) # <- missing in your code

# Optional: nested cross-validation scores for the whole search procedure
cv_scores = cross_val_score(gridlog, x, y)
print(cv_scores)

# find best parameters
print('Logistic Regression parameters: ', gridlog.best_params_)
# result:
# Logistic Regression parameters:  {'C': 1}
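
To make the point about cross_val_score concrete, here is a small sketch that checks for best_params_ before and after calling fit. The scaling pipeline, the max_iter value and the reduced grid are my own assumptions to keep the run fast and free of convergence warnings; they are not part of the original code.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Example data only
X, y = load_breast_cancer(return_X_y=True)

# Assumed setup: scale the features, then fit a logistic regression
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
params = {"logisticregression__C": [0.01, 0.1, 1, 10]}
gridlog = GridSearchCV(model, params, cv=3, scoring="roc_auc")

# cross_val_score fits clones of gridlog, so gridlog itself stays unfitted
cv_scores = cross_val_score(gridlog, X, y, cv=3)
fitted_before = hasattr(gridlog, "best_params_")

# Fitting the search object itself is what creates best_params_
gridlog.fit(X, y)
fitted_after = hasattr(gridlog, "best_params_")

print("best_params_ available after cross_val_score only:", fitted_before)
print("best_params_ available after gridlog.fit:", fitted_after)
print("Best parameters:", gridlog.best_params_)
```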

Upvotes: 2
