referring to a data frame inside a function in R

Question

I have a very large data frame of the following format

uniqueID	year	header_1	header_2	c
0001	1990	x	TRUE
0002	1990	y	FALSE	other data
0003	1995	x	FALSE

I can filter, summarise, and rearrange it like this:

    new_df <- filter(df, year %in% c(1990))

    count_new_df <- group_by(new_df, header_1, header_2) %>%
      summarise(count = n())

    count_wide <- count_new_df %>% pivot_wider(names_from = header_1, values_from = count)

If I run this as explicit code it works perfectly. However, if I wrap it in a function where d is the starting data frame, y is the year of data I want to see, and a and b are the column headers, it breaks:

    slice <- function (d,y,a,b) {
       t <- filter(d, year %in% c(y))
       c <- group_by(t, a, b) %>%
         summarise(count = n())

       c2 <- c %>% pivot_wider(names_from = a, values_from = count)

      }

with the error message: must group by variables found in '.data', column 'a' is not found, column 'b' is not found.

If I change to calling d$a and d$b I get object 'a' not found. I also tried group_by(t, t$a, t$b) and that didn't work either. What am I missing? There must be some way to call the columns of a df created inside a function.

TIA

Ronak Shah · Accepted Answer

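Inside the function, group_by(t, a, b) looks for columns literally named a and b rather than the columns you passed in. You can use {{ }} (curly-curly) to refer to the columns supplied as arguments: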

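    library(tidyverse)

    new_slice <- function(d, y, a, b) {
      # Keep only the rows for the requested year(s)
      t <- filter(d, year %in% y)
      # {{ }} forwards the columns supplied as a and b to group_by()
      c <- group_by(t, {{a}}, {{b}}) %>% summarise(count = n())
      # Can also use count()
      # c <- count(t, {{a}}, {{b}}, name = 'count')
      c2 <- c %>% pivot_wider(names_from = {{a}}, values_from = count)
      c2
    }

    # df is the data frame from the question
    new_slice(df, 1990, header_1, header_2)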
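
If the column names arrive as strings instead (for example from a vector of names), {{ }} is not the right tool. A minimal sketch of one alternative, assuming dplyr 1.0 or later: across(all_of(...)) selects columns by character name. The function name slice_by_name and the toy data below are made up for illustration and only mimic the table in the question.

```r
library(dplyr)
library(tidyr)

# Toy data modelled on the question's table (values are illustrative only)
df <- tibble(
  uniqueID = c("0001", "0002", "0003"),
  year     = c(1990, 1990, 1995),
  header_1 = c("x", "y", "x"),
  header_2 = c(TRUE, FALSE, FALSE)
)

# Hypothetical variant: a and b are column names given as strings
slice_by_name <- function(d, y, a, b) {
  d %>%
    filter(year %in% y) %>%
    # all_of() looks the strings up as column names in d
    count(across(all_of(c(a, b))), name = "count") %>%
    pivot_wider(names_from = all_of(a), values_from = count)
}

res <- slice_by_name(df, 1990, "header_1", "header_2")
res
```

The call site then quotes the names, slice_by_name(df, 1990, "header_1", "header_2"), whereas the {{ }} version above takes them unquoted.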
