Reputation: 782
I'm fitting a logistic regression and have reached the point where I have a predicted probability for each observation. Now I would like to classify each probability as either 0 or 1, given a threshold value.
For example, if I have two numbers 0.65 and 0.87 and my threshold is 0.7, I'd like to round 0.65 to 0 and 0.87 to 1.
To achieve this, I've tried the following code, which I think is too much for such a simple task, and I'd like to know if there's a function dedicated to this.
library(tidyverse)

# create a table of probabilities and predictions (0 or 1)
df <- tibble(
  prob = runif(20),
  pred = round(prob) # threshold = 0.5
)

# threshold function for length = 1
threshold_1 <- function(p, t) {
  if (p > t) 1 else 0
}

# threshold function for length = p
threshold_p <- function(ps, t) {
  map2_dbl(ps, t, threshold_1)
}

# below works.
df %>% mutate(
  pred = threshold_p(prob, 0.7)
)
I've also tried this:
# threshold = 0.7
df %>%
  mutate(
    pred = round(prob - 0.2) # shifting by 0.2 moves the cutoff from 0.5 to 0.7
  )
The above works quite nicely because no probability will be exactly 0 or 1 (as long as we're dealing with distribution functions), so even if I add or subtract up to 0.5 to shift the threshold, the results will never round to -1 or 2. It's just not very elegant.
Is there a function that does this in a much simpler way?
Upvotes: 3
Views: 1816
Reputation: 33782
Sounds like ifelse may do what you want?
library(dplyr)
df %>%
  mutate(pred = ifelse(prob < 0.7, 0, 1))
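
As a possible alternative, a plain comparison is already vectorised in R, so it can be converted to 0/1 directly. The sketch below uses made-up values (the two numbers from the question plus a value sitting exactly on the threshold) and hypothetical variable names; it also shows that `prob > t` and `ifelse(prob < t, 0, 1)` differ only in how they treat a probability equal to the threshold.

```r
# Hypothetical example data: the two values from the question plus a boundary case
prob <- c(0.65, 0.87, 0.70)
threshold <- 0.7

# The comparison yields TRUE/FALSE for every element; as.integer() maps them to 1/0.
# Strict ">" means a value equal to the threshold is classified as 0.
pred_gt <- as.integer(prob > threshold)

# The ifelse() form assigns 0 only below the threshold,
# so a value equal to the threshold is classified as 1.
pred_ifelse <- ifelse(prob < threshold, 0, 1)

print(pred_gt)
print(pred_ifelse)
```

Inside a dplyr pipeline the same idea would read `mutate(pred = as.integer(prob > 0.7))`. Which side the boundary falls on rarely matters for continuous probabilities, but it is worth choosing deliberately.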
Upvotes: 3