Adrian

Reputation: 9793

R: how to pass a variable into a function to subset data.frame

 dat = data.frame(height = c(20, 20, 40, 50, 60, 10), weight = c(100, 200, 300, 200, 140, 240),
             age = c(19, 20, 20, 19, 10, 11))
 f = function(x){
   subset.19 = dat$x[dat$age == 20]
   subset.20 = dat$x[dat$age == 19]
   t.test(subset.19, subset.20)  
 }
 f("weight")

I get an error:

 Error in var(x) : 'x' is NULL
 In addition: Warning messages:
 1: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'
 2: In mean.default(x) : argument is not numeric or logical: returning NA

I think this happens because dat$x is always NULL: there is no column named x in the data.frame. The name I pass in is never used, because dat$x always looks for a column literally named x rather than the column whose name I passed in (i.e. weight). How can I pass in the column name I want so that this function runs?

Upvotes: 2

Views: 4164

Answers (1)

akrun

Reputation: 887048

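As @agstudy and @docendodiscimus mentioned in the comments, it is better to use [ or [[ instead of $ when passing a column name into a function. $ does not evaluate its argument, so dat$x looks for a column literally named x, whereas dat[, x] and dat[[x]] use the value stored in x.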

 f <- function(x) {
   # dat[, x] evaluates x, so the string passed in selects the column
   subset.20 <- dat[, x][dat$age == 20]
   subset.19 <- dat[, x][dat$age == 19]
   t.test(subset.20, subset.19)
 }
 f("weight")
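
For illustration, here is a minimal sketch of the same idea using [[ and taking the data frame as an argument instead of relying on the global dat. The function name compare_ages and its parameters (data, column, age_a, age_b) are hypothetical and not part of the original question; the first lines only confirm why dat$x came back NULL.

```r
dat <- data.frame(
  height = c(20, 20, 40, 50, 60, 10),
  weight = c(100, 200, 300, 200, 140, 240),
  age    = c(19, 20, 20, 19, 10, 11)
)

x <- "weight"
is.null(dat$x)                    # TRUE: `$` looks for a column literally named "x"
identical(dat[[x]], dat$weight)   # TRUE: `[[` uses the value stored in x

# Hypothetical helper: compare one column between two age groups.
compare_ages <- function(data, column, age_a = 20, age_b = 19) {
  # Fail early with a clear message if the column name is misspelled.
  stopifnot(column %in% names(data))
  group_a <- data[[column]][data$age == age_a]
  group_b <- data[[column]][data$age == age_b]
  t.test(group_a, group_b)
}

result <- compare_ages(dat, "weight")
result
```

Passing the data frame explicitly keeps the function independent of a global variable, and the stopifnot() check turns a misspelled column name into an immediate error instead of the less obvious 'x' is NULL message.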

Upvotes: 4
