Reputation: 71
I need to calculate the ROC Curve for two multiclass variables and I'm getting an error I haven't been able to work out.
Here's my code and the error I'm getting:
Input:
auc = roc_auc_score(df["two_year_recid"], df["decile_score"])
auc
Output:
ValueError Traceback (most recent call last)
<ipython-input-128-223ffcb55a4f> in <module>
----> 1 auc = roc_auc_score(df["is_recid"], df["decile_score"])
2 auc
/opt/anaconda3/lib/python3.7/site-packages/sklearn/metrics/_ranking.py in roc_auc_score(y_true, y_score, average, sample_weight, max_fpr, multi_class, labels)
379 "instead".format(max_fpr))
380 if multi_class == 'raise':
--> 381 raise ValueError("multi_class must be in ('ovo', 'ovr')")
382 return _multiclass_roc_auc_score(y_true, y_score, labels,
383 multi_class, average, sample_weight)
ValueError: multi_class must be in ('ovo', 'ovr')
Both columns are int64.
I tried to implement OVR and calculate per-class roc_auc_score but it didn't work.
Can anyone please try to help out?
Thanks a lot in advance!
Upvotes: 1
Views: 823
Reputation: 1873
From the sklearn documentation:
ROC curves are typically used in binary classification to study the output
of a classifier. In order to extend the ROC curve and ROC area to multi-label
classification, it is necessary to binarize the output.
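As a rough sketch of what "binarize the output" can look like in practice: the snippet below uses synthetic 3-class data and a logistic regression (both are assumptions for illustration, not the asker's data or model), turns the labels into one 0/1 column per class with label_binarize, and computes one ROC curve per class.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_auc_score, roc_curve
from sklearn.preprocessing import label_binarize

# Synthetic 3-class problem (illustrative only).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=5, n_classes=3, random_state=0
)
clf = LogisticRegression(max_iter=1000).fit(X, y)
proba = clf.predict_proba(X)  # shape (n_samples, n_classes)

# One 0/1 indicator column per class: "class c" vs "everything else".
y_bin = label_binarize(y, classes=clf.classes_)

per_class_auc = {}
for i, c in enumerate(clf.classes_):
    fpr, tpr, _ = roc_curve(y_bin[:, i], proba[:, i])
    per_class_auc[c] = auc(fpr, tpr)

# The built-in one-vs-rest option with the default macro average
# should agree with the mean of the per-class values.
macro_ovr = roc_auc_score(y, proba, multi_class="ovr")
print(per_class_auc, macro_ovr)
```

The scores here are computed on the training data purely to keep the example short; a held-out split would be the usual choice for a real evaluation.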
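The ROC curve (and therefore the AUC) is defined for binary classification. roc_auc_score raises this error when y_true contains more than two distinct values and multi_class is left at its default of 'raise'. To extend it to multi-class classification, binarize the labels with a one-vs-rest ('ovr') or one-vs-one ('ovo') scheme and pass per-class scores of shape (n_samples, n_classes).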
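For the situation in the question (one label column and one 1-D score column), it may be worth checking how many distinct values the label column really holds. The sketch below borrows the column names from the traceback, but the rows are invented and the stray -1 label is only an assumption about what might be in the data.

```python
import pandas as pd
from sklearn.metrics import roc_auc_score

# Toy frame: column names come from the question, the values are made up.
df = pd.DataFrame({
    "is_recid": [0, 1, -1, 1, 0, 1, 0, -1],
    "decile_score": [2, 8, 5, 4, 3, 9, 1, 6],
})

# Three distinct labels make sklearn treat y_true as multiclass.
n_labels = df["is_recid"].nunique()
print(sorted(df["is_recid"].unique()))

try:
    roc_auc_score(df["is_recid"], df["decile_score"])
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Keeping only the rows with a genuine 0/1 outcome makes the target binary,
# so a single 1-D score column is enough again.
binary = df[df["is_recid"].isin([0, 1])]
binary_auc = roc_auc_score(binary["is_recid"], binary["decile_score"])
print(binary_auc)
```

Whether dropping such rows is appropriate depends on what the extra label means in the dataset; if all classes are real, a 1-D score is not sufficient and per-class scores are needed instead.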
Upvotes: 1