mommomonthewind

Reputation: 4650

Different parameters, exactly the same AUC

I have a dataset with two columns y and x.

I fitted different algorithms to predict y based on x.

For each algorithm, I have a vector of predicted values: p1, p2.

I used the function auc of the package pROC.

auc (response = test$y, predictor = p1)
auc (response = test$y, predictor = p2)

I get exactly the same AUC value to 6 decimal places. Is this possible, or is something wrong with my implementation?

Update: p1 and p2 are different.

> pROC::auc (response = test$correct_value, predictor = p1)
Area under the curve: 0.8231
> pROC::auc (response = test$correct_value, predictor = p2)
Area under the curve: 0.8231
> head (p1)
        11         14         17         22         25         26 
0.01378549 0.01378549 0.01378549 0.01203714 0.01259412 0.01259412 
> head (p2)
       11        14        17        22        25        26 
0.7511921 0.7511921 0.7511921 0.6272434 0.6715637 0.6715637

Upvotes: 1

Views: 175

Answers (1)

Calimo

Reputation: 7969

As @Jan van der Laan noted, all(rank(p1) == rank(p2)) is TRUE. But there is more to it!

If I understand your question correctly, you make predictions with a glm model based on a single random variable x. Then the following is true too:

> pROC::auc (response = test$correct_value, predictor = x)
Area under the curve: 0.8231
> all(rank(p1) == rank(x))
[1] TRUE

The reason for this is that a monotonic function of a single random variable x (here, the linear predictor of the glm passed through its link function) cannot reorder the data. As the ranks are the only relevant information for the ROC analysis, p1, p2 and x itself all give the same AUC. If you want to improve your predictions, you must either pass the data through a non-monotonic, non-linear function (nlm or similar), or introduce more random variables into the equation.
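
To see this for yourself, here is a minimal sketch on simulated data. It is not your data, and it does not use pROC: auc_rank is a hypothetical helper that computes the AUC from ranks (the Mann-Whitney statistic) in base R. The logit and probit fits are only an assumption about what your two "algorithms" might look like. The fitted probabilities differ, but the ordering and therefore the AUC do not.

```r
# Hypothetical helper: AUC from ranks (Mann-Whitney U statistic), base R only.
# response must be coded 0/1; this is not pROC's implementation.
auc_rank <- function(response, predictor) {
  r  <- rank(predictor)               # midranks, so ties are handled
  n1 <- sum(response == 1)            # number of positives
  n0 <- sum(response == 0)            # number of negatives
  (sum(r[response == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Simulated data (assumption): one predictor x, binary outcome y.
set.seed(42)
x <- rnorm(200)
y <- rbinom(200, 1, plogis(1.5 * x))

# Two different models of y on the single variable x.
p1 <- fitted(glm(y ~ x, family = binomial(link = "logit")))
p2 <- fitted(glm(y ~ x, family = binomial(link = "probit")))

auc_x  <- auc_rank(y, x)
auc_p1 <- auc_rank(y, p1)
auc_p2 <- auc_rank(y, p2)

# The predicted values differ, the ranks and the AUC do not.
print(c(x = auc_x, p1 = auc_p1, p2 = auc_p2))
print(all(rank(p1) == rank(p2)))
```

If the fitted slope were negative, the ranks would be reversed rather than preserved; pROC::auc would still report the same value in that case, because by default it picks the direction of the comparison automatically.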

Upvotes: 1
