Reputation: 110
I would like to create a new variable that assigns each value to an interval, but because there are many intervals I want to know whether the code can be written more concisely. I expect a for loop or a function might do the trick, but for now I have come up with:
library(dplyr)

mtcars %>%
  mutate(
    mpg_interval = if_else(mpg < 15, "<15",
                   if_else(mpg < 20, "15-19",
                   if_else(mpg < 25, "20-24",
                           ">24")))
  )
Is there an easier way to create many intervals (say 100, which would be impractical to write out) using dplyr commands?
Upvotes: 2
Views: 1469
Reputation: 7248
As @Aramis7d and @Florian point out in the comments above, cut() is the tool for the job. If there are too many intervals to write out by hand, cut() can be combined with seq() to generate the breaks.
Consider:
library(dplyr)

df <- data.frame(x = 1:100)
df %>%
  mutate(rg = cut(x, c(seq(0, 25, 5), Inf))) %>%
  group_by(rg) %>%
  summarise(c = n())
# A tibble: 6 × 2
        rg     c
    <fctr> <int>
1    (0,5]     5
2   (5,10]     5
3  (10,15]     5
4  (15,20]     5
5  (20,25]     5
6 (25,Inf]    75
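To get the labels from the question ("<15", "15-19", ...) rather than the default "(a,b]" labels, the breaks and the labels can both be generated from one seq() call. The following is only a sketch: the names width, finite, lower and binned are made up for illustration, and the "15-19" style labels assume a bin width of 5 and read most naturally for whole numbers. right = FALSE makes the bins closed on the left, which matches the mpg < 15 style of comparison in the question.

```r
library(dplyr)

# Assumed bin width and range; adjust both to the data at hand.
width  <- 5
finite <- seq(15, 25, by = width)        # 15, 20, 25
breaks <- c(-Inf, finite, Inf)           # open-ended first and last bins

# Labels built from the breaks instead of typed by hand.
lower  <- head(finite, -1)               # 15, 20
labels <- c(
  paste0("<", finite[1]),                    # "<15"
  paste0(lower, "-", lower + width - 1),     # "15-19", "20-24"
  paste0(">", finite[length(finite)] - 1)    # ">24"
)

binned <- mtcars %>%
  mutate(mpg_interval = cut(mpg, breaks = breaks, labels = labels,
                            right = FALSE))  # bins are [a, b)

count(binned, mpg_interval)
```

Changing the seq() arguments is then enough to go from 4 intervals to 100 without touching the rest of the code.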
Upvotes: 2
Reputation: 11728
I think what you need is case_when():
mtcars %>%
  mutate(
    mpg_interval = case_when(
      mpg < 15 ~ "<15",
      mpg < 20 ~ "15-19",
      mpg < 25 ~ "20-24",
      TRUE     ~ ">24"
    )
  )
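This works with upper bounds only because case_when() checks the conditions in order and uses the first one that is TRUE for each element. A small check on a hand-picked vector (the values and the name out are just for illustration) shows how values on the boundaries are assigned:

```r
library(dplyr)

# Illustrative values, including the boundary value 15.
mpg <- c(10.4, 15, 19.9, 24.4, 33.9)

out <- case_when(
  mpg < 15 ~ "<15",     # checked first
  mpg < 20 ~ "15-19",   # only reached when mpg >= 15
  mpg < 25 ~ "20-24",   # only reached when mpg >= 20
  TRUE     ~ ">24"      # everything else
)

out
# "<15" "15-19" "15-19" "20-24" ">24"
```

Reordering the conditions would change the result, so the branches need to stay sorted by their upper bound.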
Upvotes: 5