Jonny
Jonny

Reputation: 2783

dplyr join two tables within a function where one variable name is an argument to the function

I am trying to join two tables using dplyr within a function, where one of the variable names is defined by an argument to the function. In other dplyr functions, there is usually a version available for non-standard evaluation, e.g. select & select_, rename and rename_, etc, but not for the _join family. I found this answer, but I cannot get it to work in my code below:

df1 <- data.frame(gender = rep(c('M', 'F'), 5), var1 = letters[1:10])

new_join <- function(df, sexvar){

  df2 <- data.frame(sex = rep(c('M', 'F'), 10), var2 = letters[20:1])

  # initial attempt using usual dplyr behaviour:
  # left_join(df, df2, by = c(sexvar = 'sex'))

  # attempt using NSE:
  # left_join(df, df2, 
  #           by = c(eval(substitute(var), list(var = as.name(sexvar)))) = 'sex'))

  # attempt using setNames:
  # left_join(df, df2, by = setNames(sexvar, 'sex'))

}

new_join(df1, 'gender')

The first and second attempt give the error

Error: 'sexvar' column not found in rhs, cannot join

while the last attempt gives the error

Error: 'gender' column not found in lhs, cannot join,

which at least shows it knows I want the column gender, but somehow doesn't see it as a column heading.

Can anyone point out where I am going wrong?

Upvotes: 3

Views: 4080

Answers (2)

Drey
Drey

Reputation: 3364

A oneliner in your block can look like this (which is similar to your last attempt)

left_join(df, df2, by = structure("sex", names = sexvar))

It is also possible to extend this to two varialbes

left_join(df, df2, by = structure(sexvarDF1, names = sexvarDF2))

Upvotes: 2

Alex Ioannides
Alex Ioannides

Reputation: 1222

Try:

df1 <- data.frame(gender = rep(c('M', 'F'), 5), var1 = letters[1:10])

new_join <- function(df, sexvar){

  df2 <- data.frame(sex = rep(c('M', 'F'), 10), var2 = letters[20:1])

  join_vars <- c('sex')
  names(join_vars) <- sexvar

  left_join(df, df2, by = join_vars)
}

new_join(df1, 'gender')

I'm sure there's a more elegant way of getting this to work using lazy evaluation, etc., but this should get you up-and-running in the meantime.

Upvotes: 4

Related Questions