Indexing in user written functions to iterate over multiple variables using map

Question

I would like to tabulate multiple variables at once.

I have written the following function, which works well on a single variable:

library(dplyr)
library(purrr)

# Frequency table for one column, passed as a bare (unquoted) name
tabulate <- function(data, var1) {
  data %>%
    group_by({{ var1 }}) %>%
    summarise(n = n()) %>%
    arrange(desc(n)) %>%
    mutate(totalN = cumsum(n),
           percent = round(n / sum(n), 3),
           cumpercent = round(cumsum(n / sum(n)), 3))
}
tabulate(iris, Sepal.Length)
#   Sepal.Length     n totalN percent cumpercent
#          <dbl> <int>  <int>   <dbl>      <dbl>
# 1          5      10     10   0.067      0.067
# 2          5.1     9     19   0.06       0.127
# 3          6.3     9     28   0.06       0.187
# 4          5.7     8     36   0.053      0.24

I would like to iterate this over a series of variables using map(). I have tried the following, but it gives me an error because it cannot find x.

var <- c("Sepal.Length", "Sepal.Width", "Petal.Length")
map(var, ~ tabulate(iris, .x))

I know I can use get(var1) inside the function, and it works, but the grouping column is then labelled `get(var1)` instead of the variable name, which makes it harder to tell which variable each table refers to.

tabulate_get <- function(data, var1) {
  data %>%
    group_by(get(var1)) %>%
    summarise(n = n()) %>%
    arrange(desc(n)) %>%
    mutate(totalN = cumsum(n),
           percent = round(n / sum(n), 3),
           cumpercent = round(cumsum(n / sum(n)), 3))
}
map(var, ~ tabulate_get(iris, .x))

# Note that the output prints `get(var1)` rather than the name of the variable used, which makes interpretation harder.
#   `get(var1)`     n totalN percent cumpercent
#         <dbl> <int>  <int>   <dbl>      <dbl>
# 1         5      10     10   0.067      0.067
# 2         5.1     9     19   0.06       0.127
# 3         6.3     9     28   0.06       0.187
# 4         5.7     8     36   0.053      0.24

Is there a concise way to use map() over the variable names without using get()? Alternatively, should I specify my variables differently?

Thanks a lot for your help.

tmfmnk · Accepted Answer

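You can select the grouping column by name with across(all_of(var)), which accepts a character string and keeps the original column name in the output: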

tabulate <- function(data, var) {
  data %>%
    group_by(across(all_of(var))) %>%
    summarise(n = n()) %>%
    arrange(desc(n)) %>%
    mutate(totalN = cumsum(n),
           percent = round(n / sum(n), 3),
           cumpercent = round(cumsum(n / sum(n)), 3))
}

map(.x = var, ~ tabulate(iris, .x))

[[1]]
# A tibble: 5 x 5
  Sepal.Length     n totalN percent cumpercent
         <dbl> <int>  <int>   <dbl>      <dbl>
1          5      10     10   0.067      0.067
2          5.1     9     19   0.06       0.127
3          6.3     9     28   0.06       0.187
4          5.7     8     36   0.053      0.24 
5          6.7     8     44   0.053      0.293

[[2]]
# A tibble: 5 x 5
  Sepal.Width     n totalN percent cumpercent
        <dbl> <int>  <int>   <dbl>      <dbl>
1         3      26     26   0.173      0.173
2         2.8    14     40   0.093      0.267
3         3.2    13     53   0.087      0.353
4         3.4    12     65   0.08       0.433
5         3.1    11     76   0.073      0.507

[[3]]
# A tibble: 5 x 5
  Petal.Length     n totalN percent cumpercent
         <dbl> <int>  <int>   <dbl>      <dbl>
1          1.4    13     13   0.087      0.087
2          1.5    13     26   0.087      0.173
3          4.5     8     34   0.053      0.227
4          5.1     8     42   0.053      0.28 
5          1.3     7     49   0.047      0.327
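
If it also helps to see which table belongs to which variable at the list level, one option is to name the input vector before mapping, so that map() returns a named list instead of [[1]], [[2]], [[3]]. The following is only a minimal sketch, assuming dplyr 1.0 or later and purrr are installed. The names tabulate_by, vars and tables are hypothetical and not part of the original question; tabulate_by is used here only to avoid masking base::tabulate().

```r
library(dplyr)
library(purrr)

# Hypothetical name, chosen so that base::tabulate() is not masked
tabulate_by <- function(data, var) {
  data %>%
    group_by(across(all_of(var))) %>%      # var is a character string
    summarise(n = n(), .groups = "drop") %>%
    arrange(desc(n)) %>%
    mutate(totalN = cumsum(n),
           percent = round(n / sum(n), 3),
           cumpercent = round(cumsum(n / sum(n)), 3))
}

# Illustrative vector of column names
vars <- c("Sepal.Length", "Sepal.Width", "Petal.Length")

# set_names() names each element after itself, and map() keeps those names
tables <- map(set_names(vars), ~ tabulate_by(iris, .x))

names(tables)
tables[["Sepal.Width"]]
```

Each element can then be pulled out by variable name, for example tables[["Sepal.Width"]] or tables$Petal.Length, and the first column of each table still carries the original variable name.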
