Rahul

Reputation: 2789

Pass a string argument to a function as dataframe column name in dplyr

I am trying to pass a string variable to a function, to be used as the column name after some data alteration.

Here is the function:

cleandata <- function(df,name){
  df <- df %>%
    gather(key = 'Year',value = name,X1960:X2015)
  df <- df %>%
    select(-c(X,Indicator.Name,Indicator.Code))
  df$Year <- substr(df$Year,start = 2,stop = 5)
  df$Year <-  as.factor(df$Year)
  return(df)
}

I want to pass a string variable to 'name', and have it as the column name.

The current output of the function is:

> cleandata(lifeexp,'LifeExp')
Source: local data frame [13,888 x 4]

           Country.Name Country.Code   Year     name
                 (fctr)       (fctr) (fctr)    (dbl)
1                 Aruba          ABW   1960 65.56937
2               Andorra          AND   1960       NA
3           Afghanistan          AFG   1960 32.32851
4                Angola          AGO   1960 32.98483
5               Albania          ALB   1960 62.25437
6            Arab World          ARB   1960 46.84706
7  United Arab Emirates          ARE   1960 52.24322
8             Argentina          ARG   1960 65.21554
9               Armenia          ARM   1960 65.86346
10       American Samoa          ASM   1960       NA
..                  ...          ...    ...      ...
> 

The last column should be 'LifeExp', not name. What am I missing?

Thanks in advance,

Rahul

Upvotes: 3

Views: 1265

Answers (1)

Matthew Plourde

Reputation: 44634

You want to use gather_ here. See vignette('nse') for an explanation why.

year_cols <- names(df)[grepl('^X\\d{4}$', names(df))]
df %>% gather_('Year', name, year_cols)

The issue is that gather takes unquoted names for its key and value columns, so you can't pass in a variable that holds the name. It interprets whatever symbol you put there as the literal name you want for the value column, which is why your column ends up being called name. This is consistent with the principle that the tidyr functions without underscores are meant for interactive use, while those with underscores are meant for programmatic use.
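For readers on a newer tidyr (1.0.0 or later), where the underscore variants are deprecated, here is a minimal sketch of the same idea using pivot_longer instead of gather_. This is a swapped-in technique, not the method given above: pivot_longer takes the new column names as ordinary strings, so the function argument can be passed straight through. The small lifeexp data frame is a made-up stand-in that only mimics the column layout in the question.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for the asker's data: same column layout, two years only.
lifeexp <- data.frame(
  X = c(NA, NA),
  Country.Name = c("Aruba", "Angola"),
  Country.Code = c("ABW", "AGO"),
  Indicator.Name = "Life expectancy",
  Indicator.Code = "SP.DYN.LE00.IN",
  X1960 = c(65.57, 32.98),
  X1961 = c(66.00, 33.40),
  stringsAsFactors = FALSE
)

cleandata <- function(df, name) {
  df %>%
    # Drop the columns the original function also drops.
    select(-c(X, Indicator.Name, Indicator.Code)) %>%
    pivot_longer(
      cols = matches("^X\\d{4}$"),  # year columns such as X1960
      names_to = "Year",
      names_prefix = "X",           # strip the leading X from the year labels
      values_to = name              # a plain string, so the argument is used as-is
    ) %>%
    mutate(Year = as.factor(Year))
}

result <- cleandata(lifeexp, "LifeExp")
names(result)
```

If you would rather stay with gather, tidyr 0.7 and later should also accept an unquoted string, as in gather(key = 'Year', value = !!name, ...), since the key and value arguments support unquoting there.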

Upvotes: 3
