Reputation: 4650
I have a dataset with two columns, y and x. I ran different algorithms to predict y based on x.
For each algorithm, I have a vector of predicted values: p1, p2.
I used the function auc from the package pROC:
auc (response = test$x, predictor = p1)
auc (response = test$x, predictor = p2)
I get exactly the same AUC values to 6 decimal places. Is that possible, or is something wrong with my implementation?
Update: p1 and p2 are different.
> pROC::auc (response = test$correct_value, predictor = p1)
Area under the curve: 0.8231
> pROC::auc (response = test$correct_value, predictor = p2)
Area under the curve: 0.8231
> head (p1)
        11         14         17         22         25         26
0.01378549 0.01378549 0.01378549 0.01203714 0.01259412 0.01259412
> head (p2)
       11        14        17        22        25        26
0.7511921 0.7511921 0.7511921 0.6272434 0.6715637 0.6715637
Upvotes: 1
Views: 175
Reputation: 7969
As @Jan van der Laan noted, all(rank(p1) == rank(p2)) is TRUE. But there is more to it!
If I understand your question correctly, you make predictions with a glm model based on a single random variable x. Then the following is true too:
> pROC::auc (response = test$correct_value, predictor = x)
Area under the curve: 0.8231
> all(rank(p1) == rank(x))
[1] TRUE
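The reason for this is that a monotonic function of a single random variable x (a linear function, or the logistic transform that a glm applies to it) cannot reorder the data. At most it reverses the order, which pROC compensates for because auc picks the direction of the comparison automatically by default. As rank is the only relevant information for the ROC analysis, if you want to improve your predictions, you must either pass the data through a non-linear, non-monotonic function (nlm or similar), or introduce more random variables in the equation.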
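To make the rank argument concrete, here is a minimal sketch on simulated data (not your data). It does not use pROC. Instead it computes the AUC from ranks with the Mann-Whitney formula in a small helper, auc_rank, which is a hypothetical name introduced only for this illustration. It assumes the response is coded 0/1 and that higher scores indicate the positive class.

```r
# Rank-based AUC via the Mann-Whitney U statistic (base R only).
# Assumption: `response` is coded 0/1, with 1 as the positive class.
auc_rank <- function(response, predictor) {
  r  <- rank(predictor)            # mid-ranks, so ties count as 1/2
  n1 <- sum(response == 1)
  n0 <- sum(response == 0)
  (sum(r[response == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

set.seed(1)                        # simulated data for illustration
x <- rnorm(200)
y <- rbinom(200, 1, plogis(0.5 + 1.5 * x))

fit    <- glm(y ~ x, family = binomial)
p_link <- predict(fit, type = "link")      # linear predictor a + b * x
p_resp <- predict(fit, type = "response")  # fitted probabilities

auc_x    <- auc_rank(y, x)
auc_link <- auc_rank(y, p_link)
auc_resp <- auc_rank(y, p_resp)

# Monotonic transforms of x keep the ranking, so the three AUCs agree.
print(c(x = auc_x, link = auc_link, response = auc_resp))
print(all(rank(x) == rank(p_resp)))

# A non-monotonic transform (here x^2) reorders the data,
# so the AUC is no longer tied to that of x.
auc_sq <- auc_rank(y, x^2)
print(auc_sq)
```

The raw predictor, the linear predictor and the fitted probabilities all give the same value here, because each is an increasing function of x. Only the last score, which is not monotonic in x, produces a different ranking and hence a different AUC.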
Upvotes: 1