Reputation: 811
Fitting my data with GaussianNB() gives me a low accuracy score. I'd like to try a grid search, but it seems that the parameters sigma and theta cannot be set. Is there any way to tune GaussianNB?
Upvotes: 20
Views: 33999
Reputation: 885
In an sklearn pipeline it may look as follows:
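from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline

# PCA followed by the Gaussian naive Bayes classifier
pipe = Pipeline(steps=[
    ('pca', PCA()),
    ('estimator', GaussianNB()),
])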
parameters = {'estimator__var_smoothing': [1e-11, 1e-10, 1e-9]}
Bayes = GridSearchCV(pipe, parameters, scoring='accuracy', cv=10).fit(X_train, y_train)
print(Bayes.best_estimator_)
print('best score:')
print(Bayes.best_score_)
predictions = Bayes.best_estimator_.predict(X_test)
Upvotes: 1
Reputation: 341
As of version 0.20, GaussianNB().get_params().keys() returns 'priors' and 'var_smoothing'.
A grid search would look like:
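from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline

pipeline = Pipeline([
    ('clf', GaussianNB())
])
parameters = {
    'clf__priors': [None],
    # three distinct candidates (the original list repeated 1e-8)
    'clf__var_smoothing': [1e-8, 1e-9, 1e-10]
}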
cv = GridSearchCV(pipeline, param_grid=parameters)
cv.fit(X_train, y_train)
y_pred_gnb = cv.predict(X_test)
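As for why sigma and theta cannot go into the grid: they are fitted attributes that the estimator learns from the data, not constructor parameters. Here is a minimal sketch of the difference, assuming the iris dataset as stand-in data (it is not from the question). The variance attribute is called var_ in recent scikit-learn releases and sigma_ in older ones, so the sketch checks for both.

```python
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

# Illustrative data only; substitute your own X and y.
X, y = load_iris(return_X_y=True)

clf = GaussianNB()

# Constructor parameters: these are what GridSearchCV can vary.
tunable = sorted(clf.get_params().keys())
print(tunable)

# Fitted attributes only exist after fit() and are estimated from the data.
clf.fit(X, y)
means = clf.theta_  # per-class feature means, shape (n_classes, n_features)
# Per-class feature variances: var_ in newer releases, sigma_ in older ones.
variances = clf.var_ if hasattr(clf, "var_") else clf.sigma_
print(means.shape, variances.shape)
```

Anything that ends in an underscore is a result of fitting, so it cannot appear as a key in param_grid.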
Upvotes: 4
Reputation: 368
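You can tune the 'var_smoothing' parameter like this:
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import GaussianNB

nb_classifier = GaussianNB()
# 100 candidates spaced logarithmically from 1 down to 1e-9
params_NB = {'var_smoothing': np.logspace(0, -9, num=100)}
# cv_method is a placeholder: pass an int such as 5, or any
# cross-validation splitter such as StratifiedKFold
gs_NB = GridSearchCV(estimator=nb_classifier,
                     param_grid=params_NB,
                     cv=cv_method,
                     verbose=1,
                     scoring='accuracy')
gs_NB.fit(x_train, y_train)
gs_NB.best_params_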
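For completeness, here is a self-contained version of the same search that should run as-is. It is only a sketch under a few assumptions that are not in the answer above: the iris dataset stands in for your data, RepeatedStratifiedKFold is one possible choice for cv_method, and the grid is cut to 20 candidates to keep it quick.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import (GridSearchCV, RepeatedStratifiedKFold,
                                     train_test_split)
from sklearn.naive_bayes import GaussianNB

# Illustrative data only; replace with your own features and labels.
X, y = load_iris(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# One possible cross-validation scheme (an assumption, any splitter works).
cv_method = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=0)

# Log-spaced candidates from 1 down to 1e-9.
params_NB = {'var_smoothing': np.logspace(0, -9, num=20)}

gs_NB = GridSearchCV(estimator=GaussianNB(),
                     param_grid=params_NB,
                     cv=cv_method,
                     scoring='accuracy')
gs_NB.fit(x_train, y_train)

print(gs_NB.best_params_)
print(gs_NB.best_score_)            # mean cross-validated accuracy
print(gs_NB.score(x_test, y_test))  # accuracy on the held-out split
```

If the best value sits at either end of the grid, it is probably worth widening the range and searching again.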
Upvotes: 16