Anjith

Reputation: 2308

sklearn classification metric auc raises ValueError

I'm building a two-class classification model using KNN.

I tried to calculate the AUC score with

from sklearn.metrics import auc

auc(y_test, y_pred)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-183-980dc3c4e3d7> in <module>
----> 1 auc(y_test, y_pred)

~/.local/lib/python3.6/site-packages/sklearn/metrics/ranking.py in auc(x, y, reorder)
    117             else:
    118                 raise ValueError("x is neither increasing nor decreasing "
--> 119                                  ": {}.".format(x))
    120 
    121     area = direction * np.trapz(y, x)

ValueError: x is neither increasing nor decreasing : [1 1 1 ... 1 1 1].

Then I used roc_auc_score

from sklearn.metrics import roc_auc_score
roc_auc_score(y_test, y_pred)
0.5118361429056588

Why does auc fail whereas roc_auc_score works? I thought they were the same. What am I missing here?

Here y_test is actual target values and y_pred is my predicted values.

Upvotes: 6

Views: 9895

Answers (3)

MrE

Reputation: 20768

The short answer is:

for auc, the input arrays need to be sorted so that x is monotonically increasing or decreasing.

Upvotes: 0

OmG

Reputation: 18838

They are different in implementation and meaning:

auc:

Compute Area Under the Curve (AUC) using the trapezoidal rule. This is a general function, given points on a curve.

roc_auc_score:

Compute Area Under the Receiver Operating Characteristic Curve (ROC AUC) from prediction scores.

This means auc is more general than roc_auc_score, although you can reproduce the value of roc_auc_score with auc. The inputs to auc are the x and y coordinates of the curve itself, not labels and predictions, and your error comes from passing the wrong kind of input. In addition, x must be in monotonically increasing or decreasing order.
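As a minimal sketch of how the two functions relate, the snippet below builds the ROC curve with roc_curve and then integrates it with auc. The labels and scores are made up for illustration; they are not the asker's data.

```python
import numpy as np
from sklearn.metrics import auc, roc_auc_score, roc_curve

# Made-up labels and prediction scores, for illustration only.
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])

# roc_curve turns labels and scores into points (fpr, tpr) on the ROC curve.
fpr, tpr, thresholds = roc_curve(y_true, y_score)

# auc integrates that curve with the trapezoidal rule.
# fpr is non-decreasing, so it is a valid x for auc.
area_from_curve = auc(fpr, tpr)

# roc_auc_score does both steps in a single call.
area_direct = roc_auc_score(y_true, y_score)

print(area_from_curve, area_direct)
```

Both values should agree, because auc receives curve coordinates here rather than raw labels and predictions.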

Upvotes: 4

ssh

Reputation: 167

AUC is used most of the time to mean AUROC, which is bad practice: as Marc Claesen pointed out, AUC is ambiguous (it could be the area under any curve) while AUROC is not.

  • For binary classification you need the ROC AUC metric (roc_auc_score), not the generic area under a curve (auc).

The ValueError raised by auc comes with the following message:

x is neither increasing nor decreasing : [1 1 1 ... 1 1 1]

The auc function uses the trapezoidal rule to approximate the area under a curve, so it expects x to hold the points at which a function was sampled, in sorted order, and y to hold the function values at those points. For example, for the function y = exp(x^2):

X: 0.0, 0.1, 0.2, 0.3, 0.4

Y: 1.00000, 1.01005, 1.04081, 1.09417, 1.17351

Therefore X must be either monotonically increasing or monotonically decreasing, and Y is just the value of the function at each point. An array of class labels such as y_test does not satisfy this.
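The trapezoidal rule on the sample points above can be sketched in plain Python. The helpers trapezoid_area and is_monotonic are hypothetical names written for this illustration; they are not sklearn functions, and the monotonicity check only approximates what auc does internally.

```python
def is_monotonic(x):
    """Return True if x is entirely non-decreasing or entirely non-increasing."""
    pairs = list(zip(x, x[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)


def trapezoid_area(x, y):
    """Sum the areas of the trapezoids between consecutive (x, y) points."""
    if not is_monotonic(x):
        raise ValueError("x is neither increasing nor decreasing")
    area = 0.0
    for i in range(1, len(x)):
        width = x[i] - x[i - 1]
        mean_height = (y[i] + y[i - 1]) / 2.0
        area += width * mean_height
    return area


# Sample points of y = exp(x^2) from the table above.
x = [0.0, 0.1, 0.2, 0.3, 0.4]
y = [1.00000, 1.01005, 1.04081, 1.09417, 1.17351]
area = trapezoid_area(x, y)
print(area)

# Class labels jump between 0 and 1, so they are not a valid x axis.
labels = [1, 0, 1, 1]
print(is_monotonic(labels))
```

With the label array as x, the check fails, which mirrors the situation in the question where y_test was passed as the first argument to auc.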

Upvotes: 2
