mayona

Reputation: 41

GridSearchCV best score drops when using the best parameters to build the model

I'm trying to find the best set of hyperparameters for my Logistic Regression estimator with GridSearchCV, and to build the model using a pipeline.

My problem is that when I use the best parameters I get through grid_search.best_params_ to build the Logistic Regression model, the accuracy is different from the one I get from

grid_search.best_score_ 

Here is my code:

from sklearn import model_selection
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

x = tweet["cleaned"]
y = tweet['tag']

X_train, X_test, Y_train, Y_test = model_selection.train_test_split(x, y, test_size=0.20, random_state=42)

pipeline = Pipeline([
    ('vectorizer', TfidfVectorizer()),
    ('chi', SelectKBest()),
    ('classifier', LogisticRegression())])

grid = {
    'vectorizer__ngram_range': [(1, 1), (1, 2), (1, 3)],
    'vectorizer__stop_words': [None, 'english'],
    'vectorizer__norm': ('l1', 'l2'),
    'vectorizer__use_idf': (True, False),
    'vectorizer__analyzer': ('word', 'char', 'char_wb'),
    'classifier__penalty': ['l1', 'l2'],
    'classifier__C': [1.0, 0.8],
    'classifier__class_weight': [None, 'balanced'],
    'classifier__n_jobs': [-1],
    'classifier__fit_intercept': (True, False),
}

grid_search = GridSearchCV(pipeline, param_grid=grid, scoring='accuracy', n_jobs=-1, cv=10)
grid_search.fit(X_train, Y_train)

When I get the best score and parameters using

print(grid_search.best_score_)
print(grid_search.best_params_)

the result is

0.7165160230073953 
{'classifier__C': 1.0, 'classifier__class_weight': None, 'classifier__fit_intercept': True, 'classifier__n_jobs': -1, 'classifier__penalty': 'l1', 'vectorizer__analyzer': 'word', 'vectorizer__ngram_range': (1, 1), 'vectorizer__norm': 'l2', 'vectorizer__stop_words': None, 'vectorizer__use_idf': False}

Now if I use these parameters to build my model

pipeline = Pipeline([
    ('vectorizer', TfidfVectorizer(ngram_range=(1, 1), stop_words=None, norm='l2', use_idf=False, analyzer='word')),
    ('chi', SelectKBest(chi2, k=1000)),
    ('classifier', LogisticRegression(C=1.0, class_weight=None, fit_intercept=True, n_jobs=-1, penalty='l1'))])

model = pipeline.fit(X_train, Y_train)
print(accuracy_score(Y_test, model.predict(X_test)))

the result drops to 0.68.

Also, setting the parameters by hand is tedious, so how can I pass the best parameters to the model directly? I could not figure out how to do it like in this (answer), since my setup is slightly different from his.

Upvotes: 3

Views: 2696

Answers (2)

I put both LogisticRegression and an MLPClassifier in a pipeline and switched between the two classifiers. I used GridSearchCV to find the best parameters for each, adjusted them, and then selected the more accurate classifier for the data. Originally the MLPClassifier was more accurate, but after adjusting the C value of the logistic regression, it became the more accurate of the two.

import numpy as np
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=42)

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    # ('pca', PCA()),
    ('clf', LogisticRegression(C=5, max_iter=10000, tol=0.1)),
    # ('clf', MLPClassifier(hidden_layer_sizes=(25, 150, 25), max_iter=800, solver='lbfgs', activation='relu', alpha=0.7,
    #                       learning_rate_init=0.001, verbose=False, momentum=0.9, random_state=42))
])

pipeline.fit(X_train, y_train)

parameter_grid = {'C': np.linspace(5, 100, 5)}

grid_rf_class = GridSearchCV(
    estimator=pipeline['clf'],
    param_grid=parameter_grid,
    scoring='roc_auc',
    n_jobs=2,
    cv=5,
    refit=True,
    return_train_score=True)

grid_rf_class.fit(X_train, y_train)
predictions = grid_rf_class.predict(X_test)

print(accuracy_score(y_test, predictions))
print(grid_rf_class.best_params_)
print(grid_rf_class.best_score_)
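Instead of commenting one classifier in and out of the pipeline by hand, GridSearchCV can also switch between estimators itself if they are listed as values in the parameter grid. A rough sketch of that pattern (the switch_grid / switch_search names are just illustrative), reusing the 'clf' step and imports from above:

# Sketch: treat the 'clf' step itself as a hyperparameter and search over both
# classifiers and their own settings in a single grid search.
switch_grid = [
    {'clf': [LogisticRegression(max_iter=10000)],
     'clf__C': np.linspace(5, 100, 5)},
    {'clf': [MLPClassifier(max_iter=800, random_state=42)],
     'clf__alpha': [0.1, 0.7]},
]

switch_search = GridSearchCV(pipeline, param_grid=switch_grid, scoring='roc_auc', cv=5)
switch_search.fit(X_train, y_train)
print(switch_search.best_params_)   # shows which classifier (and which settings) won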

Upvotes: 0

MaximeKan

Reputation: 4211

The reason your score is lower in the second option is that you are evaluating your pipeline model on the test set, whereas you are evaluating your grid search model using cross-validation (in your case, 10-fold stratified cross-validation). That cross-validation score is the average over 10 models, each fitted on 9/10 of your training data and evaluated on the remaining 1/10 of that training data. Hence, you cannot expect the same score from both evaluations.
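To see the gap explicitly, you can score the same rebuilt pipeline both ways. A small sketch, assuming the rebuilt pipeline, the train/test split, and the imports from the question:

from sklearn.model_selection import cross_val_score

# Cross-validated accuracy of the rebuilt pipeline on the training data
# (comparable to grid_search.best_score_) ...
cv_scores = cross_val_score(pipeline, X_train, Y_train, cv=10, scoring='accuracy')
print(cv_scores.mean())

# ... versus its accuracy on the held-out test set (the ~0.68 figure)
print(pipeline.fit(X_train, Y_train).score(X_test, Y_test))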

As for your second question, why not just use grid_search.best_estimator_? This gives you the best model from your grid search (refitted on the whole training set), and you can evaluate it without rebuilding it from scratch. For instance:

best_model = grid_search.best_estimator_
best_model.score(X_test, Y_test)
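If you would rather keep your own pipeline object, you can also unpack best_params_ into it with set_params instead of typing every value by hand. A minimal sketch, assuming the original (untuned) pipeline and the split from the question:

# Push the winning parameters into the untuned pipeline, then refit on the training data
pipeline.set_params(**grid_search.best_params_)
pipeline.fit(X_train, Y_train)
print(accuracy_score(Y_test, pipeline.predict(X_test)))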

Upvotes: 7
