MeiNan Zhu

Reputation: 1059

Sk-learn GridSearchCV fits on full data

I used sklearn's GridSearchCV to search over the number of topics for an LDA model. After fitting, the best model is stored in CV_model.best_estimator_. According to the sklearn docs, GridSearchCV has a refit option (default=True), which will 'Refit an estimator using the best found parameters on the whole dataset.' (see the sklearn GridSearchCV documentation).

Since the documentation says the estimator has already been fit on the full data, I expected CV_model.best_estimator_.fit_transform(full_train_data) to give the same result as CV_model.best_estimator_.transform(full_train_data). However, the outputs of fit_transform and transform differ. What did I miss? Should I use fit_transform or transform after GridSearchCV?

Upvotes: -1

Views: 366

Answers (1)

MeiNan Zhu

Reputation: 1059

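I realized this was probably due to the random state not being fixed. fit_transform() fits the model again before transforming, so without a fixed seed the new fit starts from a different random initialization than the refit done by GridSearchCV. After I assigned a fixed random state, .transform() and .fit_transform() return the same results.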
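For anyone who wants to check this themselves, here is a minimal sketch. It is not my original code: the toy count matrix X, the parameter grid, max_iter=5, cv=2 and the helper name transform_vs_fit_transform are all made up for illustration, and I am assuming the LDA model is sklearn's LatentDirichletAllocation.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.model_selection import GridSearchCV

# Toy document-term count matrix (assumption: stands in for full_train_data).
rng = np.random.RandomState(0)
X = rng.poisson(1.0, size=(40, 20))


def transform_vs_fit_transform(random_state):
    """Return True if transform() and fit_transform() agree after the search."""
    lda = LatentDirichletAllocation(max_iter=5, random_state=random_state)
    # refit=True is the default, so best_estimator_ is already fit on all of X.
    search = GridSearchCV(lda, {"n_components": [2, 3]}, cv=2)
    search.fit(X)
    best = search.best_estimator_

    via_transform = best.transform(X)          # uses the refit model as-is
    via_fit_transform = best.fit_transform(X)  # fits again, then transforms
    return np.allclose(via_transform, via_fit_transform)


# With a fixed seed, the second fit reproduces the refit model.
same_with_seed = transform_vs_fit_transform(0)
# With random_state=None, each fit draws a fresh random initialization.
same_without_seed = transform_vs_fit_transform(None)
print(same_with_seed, same_without_seed)
```

In practice, calling transform() on best_estimator_ is enough after the search, since the model has already been refit on the full data; fit_transform() only repeats that fit.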

Upvotes: 0
