Sam

Reputation: 99

In R, pass a dataframe column into a function and manipulate it

I have a dataframe with a column that I want to analyze inside a function using dplyr, but I cannot figure out how to refer to the passed-in column so that R accepts it. An earlier discussion about passing a column into a function does not seem to cover this exact case. Suppose the passed-in dataframe df has a column called ID and a second, boolean column whose name is passed in as x. I want to return a frequency table of the boolean variable.

My code is below:

# function
calculate_frequency = function(df, x) {
  subset_df = df %>% group_by(ID) %>% distinct(x)
  frequency_table = as.data.frame(table(subset_df$x)) 
}

# call to function
frequency_table = calculate_frequency(df, "name_of_boolean_column")

The error I get is "Unknown or uninitialised column: 'x'." I have also tried indexing with brackets, as in df[ , x], but that does not work either.

Thank you for any help!

Upvotes: 2

Views: 593

Answers (1)

akrun

Reputation: 886948

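If the column name is passed as a string, convert it to a symbol with sym() from rlang and unquote it with !! inside distinct(). When extracting the column afterwards, use [[x]] rather than $x, because $x looks for a column literally named "x".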

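library(dplyr)

calculate_frequency <- function(df, x) {
    # x is a column name given as a string: sym() converts it to a symbol
    # and !! unquotes it so that distinct() treats it as a column
    subset_df <- df %>%
        group_by(ID) %>%
        distinct(!!rlang::sym(x))
        # or
        # distinct(get(x))
    # [[x]] extracts the column by its string name
    frequency_table <- as.data.frame(table(subset_df[[x]]))
    frequency_table
}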

calculate_frequency(df1, 'booleanCol')
#  Var1 Freq
#1    0    5
#2    1    4

data

set.seed(24)
df1 <- data.frame(ID = rep(1:5, each = 10), booleanCol = sample(0:1, 50, replace = TRUE))
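
If the column is passed as a bare (unquoted) name instead of a string, a minimal sketch using dplyr's embrace operator {{ }} could look like the following. The function name calculate_frequency_bare is hypothetical, and the sketch assumes dplyr 1.0 or later and that distinct(ID, {{ x }}) is an acceptable stand-in for group_by(ID) followed by distinct().

```r
library(dplyr)

# Hypothetical variant (name is an assumption, not from the original post):
# x is passed as a bare column name rather than as a string.
calculate_frequency_bare <- function(df, x) {
  values <- df %>%
    distinct(ID, {{ x }}) %>%   # keep one row per ID / value combination
    pull({{ x }})               # extract the column as a plain vector
  # Tabulate the remaining values and return them as a data frame
  as.data.frame(table(values))
}
```

With the sample data above, the call would be calculate_frequency_bare(df1, booleanCol), with no quotes around the column name.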

Upvotes: 1
