Sam Pickwick

Reputation: 313

Column reference data.table function R

I'm trying to write a function that references a column of the data table supplied as one of its arguments, as shown below:

df <- read.table(text = "x1 x2 y
CA 20 50
CA 30.5 100
CA 40.5 200
AZ 20.12 400
AZ 25 500
OR 86 600
OR 75 700
OR 45 800", header = TRUE)

df$x1 <- as.factor(df$x1)

library(data.table)
library(ggplot2)  # needed for ggplot(), aes(), geom_bar(), etc.

make_freq <- function(df, var_name){
  
  df <- df 
  setDT(df)
  
  
  tb <- df[, .N, by = var_name][,prop_ := round(((N/sum(N))*100), digits = 0)][order(var_name)]
  
  gg1 <- ggplot(tb, aes(x = var_name, y = prop_)) +
    geom_bar(width = .35, stat = "identity", color = "darkblue", fill = "darkblue") +
    ggtitle(paste0("var_name")) +
    theme_bw() +
    theme(plot.title = element_text(size = 10)) +
    theme(axis.text.x = element_text(angle = 45)) 
  
  return(list(figure = gg1))
}

make_freq(df = df, var_name = x1)

Ideally, I want to run the function to create the ggplot figure for any categorical variable I choose through the var_name argument. I'm getting an "object 'x1' not found" error, which makes me think I need to quote or unquote the var_name argument inside the function.

Upvotes: 0

Views: 66

Answers (2)

cazman

Reputation: 1492

Yes. If you would like to use non-standard evaluation (passing the bare column name x1), you need to capture the var_name argument unevaluated. Simply add:

var_name <- substitute(var_name)

to the top of the function. Note that the default x-axis label in this case will be var_name. If you would like it to default to whatever is passed as var_name, a couple of extra steps are needed. Change the top of the function to:

  x <- enquo(var_name)
  var_name <- substitute(var_name)

Then modify the tb line.

  tb <- df[, .N, by = eval(deparse(var_name))][,prop_ := round(((N/sum(N))*100), digits = 0)][order(eval(var_name))]

Then in ggplot():

gg1 <- ggplot(tb, aes(x = !!x, y = prop_)) + ...
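Putting the pieces above together, a minimal sketch of the non-standard evaluation approach might look like the following. It is a variation rather than a verbatim copy of the snippets above: the name make_freq_nse, the helper variable var_chr, and the extra table element in the returned list are illustrative assumptions. It also swaps setDT() for as.data.table() so the caller's data frame is left untouched, and uses setorderv() instead of order(eval(...)) for sorting.

```r
library(data.table)
library(ggplot2)

# Same data as in the question, built directly as a data frame.
df <- data.frame(
  x1 = c("CA", "CA", "CA", "AZ", "AZ", "OR", "OR", "OR"),
  x2 = c(20, 30.5, 40.5, 20.12, 25, 86, 75, 45),
  y  = c(50, 100, 200, 400, 500, 600, 700, 800)
)

# Hypothetical name; accepts a bare column name such as x1.
make_freq_nse <- function(df, var_name) {
  x <- rlang::enquo(var_name)               # quosure, used inside aes()
  var_chr <- deparse(substitute(var_name))  # column name as a string, e.g. "x1"

  dt <- as.data.table(df)                   # copy; the caller's df is not modified

  # Count rows per group, then convert counts to rounded percentages.
  tb <- dt[, .N, by = var_chr][, prop_ := round(N / sum(N) * 100)]
  setorderv(tb, var_chr)                    # sort by the grouping column

  gg1 <- ggplot(tb, aes(x = !!x, y = prop_)) +
    geom_col(width = 0.35, color = "darkblue", fill = "darkblue") +
    ggtitle(var_chr) +
    theme_bw() +
    theme(plot.title = element_text(size = 10),
          axis.text.x = element_text(angle = 45))

  # Returning the table as well is an addition for easier inspection.
  list(table = tb, figure = gg1)
}

res <- make_freq_nse(df, x1)
```

Here res$table holds the frequency table and res$figure the plot. Because the column is passed as a quosure to aes(), the default x-axis label should be the column name itself.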

Upvotes: 2

  1. You need to quote x1, because no object with that name exists; it is only the name of a column.
  2. The by argument of a data.table accepts a character vector, so df[, .N, by = var_name] is fine. But [order(var_name)] is wrong, because it sorts by the string itself rather than by the column. Use [order(get(var_name))] instead.
  3. Because var_name is a character string, var_name must also be changed to get(var_name) inside ggplot().

Full code:

make_freq <- function(df, var_name){
    
    df <- df 
    setDT(df)
    
    
    tb <- df[, .N, by = var_name][,prop_ := round(((N/sum(N))*100), digits = 0)][order(get(var_name))]
    
    gg1 <- ggplot(tb, aes(x = get(var_name), y = prop_)) +
        geom_bar(width = .35, stat = "identity", color = "darkblue", fill = "darkblue") +
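        ggtitle(var_name) +  # title shows the column name instead of the literal "var_name"
        xlab(var_name) +     # otherwise the axis label would read "get(var_name)"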
        theme_bw() +
        theme(plot.title = element_text(size = 10)) +
        theme(axis.text.x = element_text(angle = 45)) 
    
    return(list(figure = gg1))
}


make_freq(df = df, var_name = "x1")
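Since the goal is a plot for any categorical variable, the string-based version lends itself to looping over several column names. The sketch below is one possible variation under stated assumptions: the function name make_freq_chr, the extra size column, and the table element in the result are hypothetical and not part of the original code. It uses ggplot2's .data pronoun in place of get(), and as.data.table() in place of setDT() so the input data frame is not converted by reference.

```r
library(data.table)
library(ggplot2)

# Data from the question, plus a hypothetical second categorical column `size`.
df <- data.frame(
  x1   = c("CA", "CA", "CA", "AZ", "AZ", "OR", "OR", "OR"),
  x2   = c(20, 30.5, 40.5, 20.12, 25, 86, 75, 45),
  y    = c(50, 100, 200, 400, 500, 600, 700, 800),
  size = c("S", "S", "L", "L", "S", "L", "S", "L")
)

# Hypothetical name; var_name is a single column name given as a string.
make_freq_chr <- function(df, var_name) {
  stopifnot(is.character(var_name), length(var_name) == 1,
            var_name %in% names(df))

  dt <- as.data.table(df)   # copy; leaves the caller's data frame as is

  # Frequency table with rounded percentages, sorted by the grouping column.
  tb <- dt[, .N, by = var_name][, prop_ := round(N / sum(N) * 100)]
  setorderv(tb, var_name)

  gg1 <- ggplot(tb, aes(x = .data[[var_name]], y = prop_)) +
    geom_col(width = 0.35, color = "darkblue", fill = "darkblue") +
    labs(x = var_name, title = var_name) +
    theme_bw() +
    theme(plot.title = element_text(size = 10),
          axis.text.x = element_text(angle = 45))

  list(table = tb, figure = gg1)
}

# One result per categorical column, named after the column.
cat_vars <- c("x1", "size")
plots <- lapply(setNames(cat_vars, cat_vars),
                function(v) make_freq_chr(df, v))
```

Each figure is then available as, for example, plots$x1$figure or plots$size$figure.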

Upvotes: 1
