Reputation: 405
How can I access clf.best_params_ after fitting a pipeline? With the code below, I get:
AttributeError: 'GridSearchCV' object has no attribute 'best_params_'
Here is my code:
from sklearn.datasets import make_classification
import numpy as np
from sklearn import metrics
from sklearn.metrics import accuracy_score
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, GridSearchCV
import matplotlib.pyplot as plt
f, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, figsize=(20, 8))
# Generate noisy Data
num_trainsamples = 500
num_testsamples = 50
X_train, y_train = make_classification(n_samples=num_trainsamples,
                                       n_features=240,
                                       n_informative=9,
                                       n_redundant=0,
                                       n_repeated=0,
                                       n_classes=10,
                                       n_clusters_per_class=1,
                                       class_sep=9,
                                       flip_y=0.2,
                                       #weights=[0.5,0.5],
                                       random_state=17)
X_test, y_test = make_classification(n_samples=50,
                                     n_features=num_testsamples,
                                     n_informative=9,
                                     n_redundant=0,
                                     n_repeated=0,
                                     n_classes=10,
                                     n_clusters_per_class=1,
                                     class_sep=10,
                                     flip_y=0.2,
                                     #weights=[0.5,0.5],
                                     random_state=17)
from sklearn.pipeline import Pipeline
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
pipe = Pipeline([('scaler', StandardScaler()),
                 ('pca', PCA(n_components=0.95)),
                 ('clf', RandomForestClassifier())])
# Declare a hyperparameter grid
parameter_space = {
    'clf__n_estimators': [10, 50, 100],
    'clf__criterion': ['gini', 'entropy'],
    'clf__max_depth': np.linspace(10, 50, 11),
}
clf = GridSearchCV(pipe, parameter_space, cv = 5, scoring = "accuracy", verbose = True) # model
pipe.fit(X_train,y_train)
print(f'Best Parameters: {clf.best_params_}')
Upvotes: 0
Views: 305
Reputation: 2851
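Your clf is never fitted: the code calls pipe.fit(X_train,y_train), which fits only the bare pipeline and leaves the GridSearchCV object untouched, so it never gets a best_params_ attribute. You probably meant clf.fit(X_train,y_train).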
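A minimal sketch of that fix, assuming a smaller synthetic dataset and a reduced grid than in the question so the search finishes quickly (the sample sizes, grid values, cv=3 and random_state are illustrative choices, not taken from the original code):

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Small synthetic problem (sizes are arbitrary, chosen only for speed).
X_train, y_train = make_classification(n_samples=200, n_features=20,
                                       n_informative=5, random_state=17)

pipe = Pipeline([('scaler', StandardScaler()),
                 ('pca', PCA(n_components=0.95)),
                 ('clf', RandomForestClassifier(random_state=17))])

# Reduced grid for illustration; keys use the <step name>__<parameter> convention.
parameter_space = {'clf__n_estimators': [10, 50],
                   'clf__max_depth': [5, 10]}

clf = GridSearchCV(pipe, parameter_space, cv=3, scoring='accuracy')

# Fit the search object, not the bare pipeline:
# best_params_ is only set by GridSearchCV.fit().
clf.fit(X_train, y_train)
print(f'Best Parameters: {clf.best_params_}')
```

With the default refit=True, the fitted search also exposes the winning pipeline as clf.best_estimator_, and clf.predict() delegates to it.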
Also, np.linspace(10,50,11) yields floats, while max_depth expects ints, so this may fail. You should probably add a type cast there (like np.linspace(10,50,11).astype('int')) or use something like arange() instead.
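To make the difference concrete, here is a small comparison of the three ways of building the max_depth grid. The variable names are only for illustration, and np.arange(10, 51, 4) is one possible integer range covering the same values:

```python
import numpy as np

# linspace returns floats even when every value is a whole number.
float_depths = np.linspace(10, 50, 11)

# Option 1: cast the same values to integers.
cast_depths = np.linspace(10, 50, 11).astype(int)

# Option 2: build an integer range directly (stop is exclusive, hence 51).
range_depths = np.arange(10, 51, 4)

print(float_depths.dtype, cast_depths.dtype, range_depths.dtype)
print(np.array_equal(cast_depths, range_depths))
```

Either integer array can be used as the value for 'clf__max_depth' in the parameter grid.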
You should likely also fix your test set: it is currently generated by a separate make_classification call with a different number of features (50 instead of 240), so it has no relation to the training set. Last but not least, PCA is not guaranteed to be useful for classification (see e.g. https://www.csd.uwo.ca/~oveksler/Courses/CS434a_541a/Lecture8.pdf).
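One way to get a test set that actually relates to the training data is to generate a single dataset and split it. This is a sketch under assumptions: the total of 550 samples mirrors the 500/50 sizes in the question, and stratify=y is an added choice to keep class proportions similar in both parts.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# One dataset, generated once, so train and test share the same distribution.
X, y = make_classification(n_samples=550, n_features=240, n_informative=9,
                           n_redundant=0, n_repeated=0, n_classes=10,
                           n_clusters_per_class=1, class_sep=9, flip_y=0.2,
                           random_state=17)

# Hold out 50 samples for testing; stratify keeps the class balance comparable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=50, stratify=y, random_state=17)

print(X_train.shape, X_test.shape)
```

The held-out part should only be used for the final evaluation, after the grid search has been fitted on X_train and y_train.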
Upvotes: 1