Reputation: 95
Let's say I have a vector of probabilities
> probs <- c(0.2, 0.3, 0.5, 0.7, 0.8, 0.9)
> probs
[1] 0.2 0.3 0.5 0.7 0.8 0.9
I want to classify each element as positive or negative by comparing it to some threshold value (for the sake of argument, let's say that an element with probability >= threshold is classified as positive, and otherwise as negative). I don't know which threshold value I want to use, but I know I want exactly 3 elements to be classified as positive.
My own solution would be to try each probability in turn as the threshold and check whether it yields the desired number of positives:
> sum(probs >= 0.2)
[1] 6
> sum(probs >= 0.3)
[1] 5
> sum(probs >= 0.5)
[1] 4
> sum(probs >= 0.7)
[1] 3
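The manual check above can be sketched as a single pass over all candidate thresholds. This is only an illustration of the brute-force idea, assuming the vector contains no NA values; the names `n`, `counts` and `threshold` are made up for the example.

```r
probs <- c(0.2, 0.3, 0.5, 0.7, 0.8, 0.9)
n <- 3  # desired number of positives

# For each candidate threshold, count how many elements would be positive
counts <- sapply(probs, function(t) sum(probs >= t))

# Keep only the candidates that yield exactly n positives
threshold <- probs[counts == n]
threshold
```

If no candidate gives exactly `n` positives (for example because of tied values), `threshold` comes back as an empty vector.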
Is there any function in R (libraries included) that would offer that functionality out-of-the-box?
EDIT: This problem has a rather straightforward solution (which makes a dedicated function unnecessary), so I will accept the top solution, even though it doesn't strictly answer the question.
Upvotes: 0
Views: 195
Reputation: 389355
You can sort the vector in decreasing order and select the n-th value:
n <- 3
sort(probs, decreasing = TRUE)[n]
#[1] 0.7
Or, with order:
probs[order(-probs)[n]]
#[1] 0.7
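One caveat: if several elements are tied at the n-th largest value, no threshold gives exactly n positives under the >= rule. A minimal sketch of a wrapper that checks for this is below; `threshold_for_n` is a hypothetical helper (not from base R or any package), and it assumes `probs` has no NA values.

```r
# Hypothetical helper: returns the n-th largest value as the threshold and
# warns when ties mean more than n elements are >= that threshold.
threshold_for_n <- function(probs, n) {
  stopifnot(n >= 1, n <= length(probs))
  threshold <- sort(probs, decreasing = TRUE)[n]
  if (sum(probs >= threshold) != n) {
    warning("ties at the threshold: more than n elements are >= threshold")
  }
  threshold
}

probs <- c(0.2, 0.3, 0.5, 0.7, 0.8, 0.9)
threshold <- threshold_for_n(probs, 3)
positives <- probs >= threshold  # logical vector marking the positive elements
```

With the example vector there are no ties, so the check passes silently and exactly three elements are flagged.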
Upvotes: 1