Reputation: 6260
This is very similar to this post: ValueError: 'balanced_accuracy' is not a valid scoring value in scikit-learn
I am using:
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_validate

scoring = ['precision_macro', 'recall_macro', 'balanced_accuracy_score']
clf = DecisionTreeClassifier(random_state=0)
scores = cross_validate(clf, X, y, scoring=scoring, cv=10, return_train_score=True)
And I receive the error:
ValueError: 'balanced_accuracy_score' is not a valid scoring value. Use sorted(sklearn.metrics.SCORERS.keys()) to get valid options.
I followed the recommended solution and upgraded scikit-learn (in the environment).
When I check the possible scorers:
sklearn.metrics.SCORERS.keys()
dict_keys(['explained_variance', 'r2', 'max_error', 'neg_median_absolute_error', 'neg_mean_absolute_error', 'neg_mean_squared_error', 'neg_mean_squared_log_error', 'neg_root_mean_squared_error', 'neg_mean_poisson_deviance', 'neg_mean_gamma_deviance', 'accuracy', 'roc_auc', 'roc_auc_ovr', 'roc_auc_ovo', 'roc_auc_ovr_weighted', 'roc_auc_ovo_weighted', 'balanced_accuracy', 'average_precision', 'neg_log_loss', 'neg_brier_score', 'adjusted_rand_score', 'homogeneity_score', 'completeness_score', 'v_measure_score', 'mutual_info_score', 'adjusted_mutual_info_score', 'normalized_mutual_info_score', 'fowlkes_mallows_score', 'precision', 'precision_macro', 'precision_micro', 'precision_samples', 'precision_weighted', 'recall', 'recall_macro', 'recall_micro', 'recall_samples', 'recall_weighted', 'f1', 'f1_macro', 'f1_micro', 'f1_samples', 'f1_weighted', 'jaccard', 'jaccard_macro', 'jaccard_micro', 'jaccard_samples', 'jaccard_weighted'])
I still cannot find it. Where is the problem?
Upvotes: 5
Views: 5860
Reputation: 8187
According to the docs for valid scorers, the value of the scoring parameter corresponding to the balanced_accuracy_score scorer function is "balanced_accuracy", as in my other answer.
Change:
scoring = ['precision_macro', 'recall_macro', 'balanced_accuracy_score']
to:
scoring = ['precision_macro', 'recall_macro', 'balanced_accuracy']
and it should work.
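For example, here is a minimal sketch of the working call (using the iris toy data purely as a stand-in for your own X and y):

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_validate

# toy data just for illustration; your own X, y work the same way
X, y = load_iris(return_X_y=True)

scoring = ['precision_macro', 'recall_macro', 'balanced_accuracy']
clf = DecisionTreeClassifier(random_state=0)
scores = cross_validate(clf, X, y, scoring=scoring, cv=10, return_train_score=True)

# each metric appears as 'test_<name>' (and 'train_<name>' with return_train_score=True)
print(scores['test_balanced_accuracy'].mean())
print(scores['train_balanced_accuracy'].mean())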
I do find the documentation a bit lacking in this respect, and the convention of removing the _score suffix is not applied consistently either: all the clustering metrics still keep _score in the names used for the scoring parameter.
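For instance, a quick check against the same SCORERS dict from your question shows which scorer names still carry the suffix:

from sklearn.metrics import SCORERS

# scorer names that still end in _score -- these are the clustering metrics
print(sorted(name for name in SCORERS.keys() if name.endswith('_score')))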
Upvotes: 7