I use the ROCR package to draw the ROC curve. The code is as follows:
pred <- prediction(my.pred, my.label)
perf <- performance(pred, 'tpr', 'fpr')  # performance() takes the prediction object
plot(perf, avg="threshold")
My pred and perf objects hold lists rather than single vectors (one element per run), so I can plot an average ROC curve.
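For context, my inputs have roughly this shape (made-up data, one list element per cross-validation fold):

my.pred  <- list(rnorm(50), rnorm(50), rnorm(50))     # one score vector per fold
my.label <- list(sample(0:1, 50, replace=TRUE),       # matching label vectors
                 sample(0:1, 50, replace=TRUE),
                 sample(0:1, 50, replace=TRUE))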
Can anyone tell me how to calculate the average sensitivity and specificity at a specified cutoff with the ROCR package?
Actually, ROCR is overkill for this task. The performance function of ROCR returns performance metrics at every score that is present in its input, so in theory you could do the following:
library(ROCR)

set.seed(123)
N <- 1000
POSITIVE_CASE <- 'case A'
NEGATIVE_CASE <- 'case B'
CUTOFF <- 0.456

# simulated scores with randomly assigned class labels
scores <- rnorm(n=N)
labels <- ifelse(runif(N) > 0.5, POSITIVE_CASE, NEGATIVE_CASE)

pred <- prediction(scores, labels)
perf <- performance(pred, 'sens', 'spec')
At this point perf contains a lot of useful information:
> str(perf)
Formal class 'performance' [package "ROCR"] with 6 slots
..@ x.name : chr "Specificity"
..@ y.name : chr "Sensitivity"
..@ alpha.name : chr "Cutoff"
..@ x.values :List of 1
.. ..$ : num [1:1001] 1 1 0.998 0.996 0.996 ...
..@ y.values :List of 1
.. ..$ : num [1:1001] 0 0.00202 0.00202 0.00202 0.00405 ...
..@ alpha.values:List of 1
.. ..$ : num [1:1001] Inf 3.24 2.69 2.68 2.58 ...
Now you can search for your score cut-off in perf@alpha.values and find the corresponding sensitivity and specificity values. If you don't find the exact cut-off value in perf@alpha.values, you'll have to do some interpolation:
ix <- which.min(abs(perf@alpha.values[[1]] - CUTOFF))  # nearest cut-off is good enough in our case
sensitivity <- perf@y.values[[1]][ix]  # note the order of arguments to `performance` and of x and y in `perf`
specificity <- perf@x.values[[1]][ix]
Which gives you:
> sensitivity
[1] 0.3319838
> specificity
[1] 0.6956522
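If the exact cut-off does not appear in perf@alpha.values and the nearest match is too crude, here is a rough sketch of linear interpolation with base R's approx(), assuming the perf object and CUTOFF from above:

# interpolate sensitivity/specificity at CUTOFF between the two
# surrounding cut-offs; drop the leading Inf cut-off first
keep  <- is.finite(perf@alpha.values[[1]])
alpha <- perf@alpha.values[[1]][keep]
sensitivity <- approx(alpha, perf@y.values[[1]][keep], xout=CUTOFF)$y
specificity <- approx(alpha, perf@x.values[[1]][keep], xout=CUTOFF)$y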
But there is a much simpler and faster way: just convert your label strings to a binary vector and calculate the metrics directly:
binary.labels <- labels == POSITIVE_CASE
tp <- sum( (scores > CUTOFF) & binary.labels )       # true positives
sensitivity <- tp / sum(binary.labels)
tn <- sum( (scores <= CUTOFF) & (! binary.labels) )  # true negatives
specificity <- tn / sum(!binary.labels)
Which gives you:
> sensitivity
[1] 0.3319838
> specificity
[1] 0.6956522
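Equivalently, you can read the same numbers off a 2x2 confusion matrix; a quick sketch with base R's table(), reusing the objects defined above:

# confusion matrix of predicted vs. actual class
predicted <- scores > CUTOFF
actual    <- labels == POSITIVE_CASE
cm <- table(predicted, actual)
sensitivity <- cm["TRUE", "TRUE"]   / sum(cm[, "TRUE"])   # TP / (TP + FN)
specificity <- cm["FALSE", "FALSE"] / sum(cm[, "FALSE"])  # TN / (TN + FP)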