I'm using the following script to make a table in R:
library(dplyr)
library(tidyr)
get_probability <- function(parameter_array, threshold) {
  return(round(100 * sum(parameter_array >= threshold) /
                 length(parameter_array)))
}
thresholds = c(75, 100, 125)
mtcars %>% group_by(gear) %>%
  dplyr::summarise(
    low=get_probability(disp, thresholds[[1]]),
    medium=get_probability(disp, thresholds[[2]]),
    high=get_probability(disp, thresholds[[3]]),
  )
The table that comes out is the following:
# A tibble: 3 x 4
gear low medium high
<dbl> <dbl> <dbl> <dbl>
1 3 100 100 93
2 4 92 67 50
3 5 100 80 60
My question is: how can I condense what I have passed to summarise into a single line? That is, is there a way to iterate over the thresholds vector while also passing custom variable names?
In recent versions of dplyr (1.0.0 and later), summarise will auto-splice data.frames created within it into new columns. So you just need a way to iterate over thresholds to create a data.frame.
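To see the auto-splicing on its own before adding any iteration, here is a minimal sketch. The column names n_cars and mean_disp are made up for illustration and are not part of the question.

```r
library(dplyr, warn.conflicts = FALSE)

# An unnamed data.frame returned inside summarise() is unpacked into
# one output column per data.frame column.
# n_cars and mean_disp are hypothetical names chosen for this sketch.
spliced <- mtcars %>%
  group_by(gear) %>%
  summarise(data.frame(n_cars = n(), mean_disp = mean(disp)))

# Column names of the result: gear, n_cars, mean_disp
names(spliced)
```

Because the data.frame is passed without a name, its columns become top-level columns of the result rather than a single nested column.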
One option is purrr::map_dfc.
library(dplyr, warn.conflicts = FALSE)
get_probability <- function(parameter_array, threshold) {
  return(round(100 * sum(parameter_array >= threshold) /
                 length(parameter_array)))
}
thresholds = c(75, 100, 125)
thresholds <- setNames(thresholds, c('low', 'medium', 'high'))
mtcars %>%
  group_by(gear) %>%
  summarise(purrr::map_dfc(thresholds, ~ get_probability(disp, .x)))
#> # A tibble: 3 × 4
#> gear low medium high
#> <dbl> <dbl> <dbl> <dbl>
#> 1 3 100 100 93
#> 2 4 92 67 50
#> 3 5 100 80 60
If you prefer not to use an extra package, you could just use lapply and then convert the output to a data.frame. (Replace \(x) with function(x) in versions of R before 4.1.0.)
mtcars %>%
  group_by(gear) %>%
  summarise(as.data.frame(lapply(thresholds, \(x) get_probability(disp, x))))
#> # A tibble: 3 × 4
#> gear low medium high
#> <dbl> <dbl> <dbl> <dbl>
#> 1 3 100 100 93
#> 2 4 92 67 50
#> 3 5 100 80 60
Created on 2021-08-17 by the reprex package (v2.0.1)
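For reference, here is a self-contained sketch of the lapply variant written with function(x), so it should also run on R versions before 4.1.0. Writing thresholds as a named vector literal is my own shortcut, equivalent to the setNames call above, and storing the output in result is only for illustration.

```r
library(dplyr, warn.conflicts = FALSE)

# Same helper as in the question: percentage of values at or above a threshold.
get_probability <- function(parameter_array, threshold) {
  round(100 * sum(parameter_array >= threshold) / length(parameter_array))
}

# The names of the vector become the output column names.
thresholds <- c(low = 75, medium = 100, high = 125)

# lapply() over a named vector returns a named list, which
# as.data.frame() turns into a one-row data.frame per group.
result <- mtcars %>%
  group_by(gear) %>%
  summarise(as.data.frame(lapply(thresholds, function(x) get_probability(disp, x))))

result
```

Adding or renaming a threshold then only requires editing the thresholds vector; the summarise call stays the same.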