Reputation: 1304
I would like to tabulate multiple variables at once.
I have written the following function, which works well on a single variable:
library(dplyr)
library(purrr)
iris <- iris
# Frequency table for one column, passed unquoted (e.g. Sepal.Length)
tabulate <- function(data, var1) {
  data %>%
    group_by({{ var1 }}) %>%
    summarise(n = n()) %>%
    arrange(-n) %>%  # most frequent values first
    mutate(totalN = cumsum(n),
           percent = round(n / sum(n), 3),
           cumpercent = round(cumsum(n / sum(n)), 3))
}
tabulate(iris,Sepal.Length)
# Sepal.Length n totalN percent cumpercent
# <dbl> <int> <int> <dbl> <dbl>
# 1 5 10 10 0.067 0.067
# 2 5.1 9 19 0.06 0.127
# 3 6.3 9 28 0.06 0.187
# 4 5.7 8 36 0.053 0.24
I would like to iterate this over a series of variables using map().
I have tried the following, but it gives me an error because it cannot find x.
var<-c("Sepal.Length","Sepal.Width","Petal.Length")
map(var,~tabulate(iris,.x))
I know I can use get(var1) when writing the function, and it works, but the output is slightly different, which makes it harder to tell which variable it refers to.
tabulate_get <- function(data, var1) {
  data %>%
    group_by(get(var1)) %>%
    summarise(n = n()) %>%
    arrange(-n) %>%
    mutate(totalN = cumsum(n),
           percent = round(n / sum(n), 3),
           cumpercent = round(cumsum(n / sum(n)), 3))
}
map(var,~tabulate_get(iris,.x))
# Note that the output header shows `get(var1)` rather than the name of the variable used, which makes interpretation harder.
# `get(var1)` n totalN percent cumpercent
# <dbl> <int> <int> <dbl> <dbl>
# 1 5 10 10 0.067 0.067
# 2 5.1 9 19 0.06 0.127
# 3 6.3 9 28 0.06 0.187
# 4 5.7 8 36 0.053 0.24
Is there a concise way to use map() over the variables without using get()? Alternatively, could I list my variables differently?
Thanks a lot for your help.
Upvotes: 1
Views: 41
Reputation: 39858
You can pass the column name as a string and select it with across(all_of()):
tabulate <- function(data, var) {
  data %>%
    group_by(across(all_of(var))) %>%
    summarise(n = n()) %>%
    arrange(-n) %>%
    mutate(totalN = cumsum(n),
           percent = round(n / sum(n), 3),
           cumpercent = round(cumsum(n / sum(n)), 3))
}
map(.x = var, ~ tabulate(iris, .x))
[[1]]
# A tibble: 5 x 5
Sepal.Length n totalN percent cumpercent
<dbl> <int> <int> <dbl> <dbl>
1 5 10 10 0.067 0.067
2 5.1 9 19 0.06 0.127
3 6.3 9 28 0.06 0.187
4 5.7 8 36 0.053 0.24
5 6.7 8 44 0.053 0.293
[[2]]
# A tibble: 5 x 5
Sepal.Width n totalN percent cumpercent
<dbl> <int> <int> <dbl> <dbl>
1 3 26 26 0.173 0.173
2 2.8 14 40 0.093 0.267
3 3.2 13 53 0.087 0.353
4 3.4 12 65 0.08 0.433
5 3.1 11 76 0.073 0.507
[[3]]
# A tibble: 5 x 5
Petal.Length n totalN percent cumpercent
<dbl> <int> <int> <dbl> <dbl>
1 1.4 13 13 0.087 0.087
2 1.5 13 26 0.087 0.173
3 4.5 8 34 0.053 0.227
4 5.1 8 42 0.053 0.28
5 1.3 7 49 0.047 0.327
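If it helps to tell the results apart, one option is to name the list elements after the variables, so each table can be pulled out by name instead of by position. The following is only a minimal sketch: the names tabulate_var, vars and tables are hypothetical (the function is renamed here just to avoid masking base R's tabulate()), and it assumes dplyr 1.0 or later and purrr are installed.

```r
library(dplyr)
library(purrr)

# Same logic as the function above; tabulate_var is a hypothetical name
# chosen so that base::tabulate() is not masked.
tabulate_var <- function(data, var) {
  data %>%
    group_by(across(all_of(var))) %>%   # var is a column name given as a string
    summarise(n = n()) %>%
    arrange(-n) %>%
    mutate(totalN = cumsum(n),
           percent = round(n / sum(n), 3),
           cumpercent = round(cumsum(n / sum(n)), 3))
}

vars <- c("Sepal.Length", "Sepal.Width", "Petal.Length")

# set_names() names each element of the vector after itself,
# so map() returns a list named by variable.
tables <- map(set_names(vars), ~ tabulate_var(iris, .x))

names(tables)        # the three variable names
tables$Sepal.Width   # the frequency table for Sepal.Width
```

Because the grouping column keeps its own name in each tibble and the list is also named, the variable is identifiable both when printing the whole list and when extracting a single table.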
Upvotes: 2