Malta
Malta

Reputation: 1993

how to parameter a variable name using dplyr

I am sorry if my title isn't great, if anyone have a better one, please enhance it.

I am currently struggling with the new dplyr 0.7 that enable to use variable inside my own functions using dplyr. I don't know if what I try isn't possible or if I take it the wrong way.

I have a simple data frame whith one line per company, and for each company a bunch of variables for the year 2015 and 2014 :

evo <- data.frame(id=1:5,
              CA2015 = c(1200,1500,1550,200,0),
              CA2014 = c(800,50,654,8555,0),
              VA2015 = c(6984,6588,633,355,84),
              VA2014 = c(35,358,358,1331,86))

I would like to be able to create, for each of my variable, an "evo" variable which is basically the value of 2015 minus the value of 2013.

In my example, I want to get a variable named evoCA which is equal to CA2015 - CA2014, and a variable named evoVA which is equal to VA2015 - VA2014.

(basically :

evo %>%
  mutate(evoCA= CA2015 - CA2014,
     evoVA = VA2015 - VA2014)

)

So I tried to make a function that would take into parameter the name of the variable I want to diff, but I can't make it work. Here's the best I can do :

addEvo <- function(table,var,var2014,var2015) {
var <- enquo(var)
var2014 <- enquo(var2014)
var2015 <- enquo(var2015)
evoName <- paste0("evo",var)[2]
table %>%
 mutate(!!evoNom := (!!var2015) - (!!var2013)) %>%
  return()

}

but it's not very clean because I have to pass not only the variable name, but the variable with its suffixes.

Any idea to make it better ?

Upvotes: 2

Views: 131

Answers (2)

cderv
cderv

Reputation: 6552

using tidyr you could transform your data to something more tidy.

If the years are always the same two years and the prefix is always before the years, something like like could add all the columns you want:

evo <- data.frame(id=1:5,
                  CA2015 = c(1200,1500,1550,200,0),
                  CA2014 = c(800,50,654,8555,0),
                  VA2015 = c(6984,6588,633,355,84),
                  VA2014 = c(35,358,358,1331,86))
library(dplyr, warn.conflicts = F)
library(tidyr, warn.conflicts = F)
evo %>%
  gather(key = "type", value = "value", -id) %>%
  separate(type, c("prefix", "year"), sep = -5) %>%
  spread(year, value) %>%
  mutate(evo = `2015` - `2014`) %>%
  gather(key = "key", value = "value", -(id:prefix)) %>%
  unite("type", prefix:key, sep = "") %>%
  spread(type, value) %>%
  select(id, ends_with("2015"), ends_with("2014"), ends_with("evo"))
#>   id CA2015 VA2015 CA2014 VA2014 CAevo VAevo
#> 1  1   1200   6984    800     35   400  6949
#> 2  2   1500   6588     50    358  1450  6230
#> 3  3   1550    633    654    358   896   275
#> 4  4    200    355   8555   1331 -8355  -976
#> 5  5      0     84      0     86     0    -2

if the two years are different than 2014 and 2015, you could also make a function.

Upvotes: 1

Stanislaus Stadlmann
Stanislaus Stadlmann

Reputation: 590

Here's a solution that does not require writing a new function:

library(dplyr)
evo <- data.frame(id=1:5,
                  CA2015 = c(1200,1500,1550,200,0),
                  CA2014 = c(800,50,654,8555,0),
                  VA2015 = c(6984,6588,633,355,84),
                  VA2014 = c(35,358,358,1331,86))

# This creates two dataframes with the same properties and two diff years
evo2014 <- evo %>%
  select(contains("2014"))
evo2015 <- evo %>%
  select(contains("2015"))

# If both have the same amount of columns make a matrix subtraction and obtain the difference. Then bind the difference dataframe to the original df
if (ncol(evo2014) == ncol(evo2015)) {
  evodiff <- as.matrix(evo2015 - evo2014)
  varnames <- substr(colnames(evo2014), 1, 2)
  colnames(evodiff) <- paste0("evo", varnames)
  evo <- bind_cols(evo, as.data.frame(evodiff))
}
evo
#>   id CA2015 CA2014 VA2015 VA2014 evoCA evoVA
#> 1  1   1200    800   6984     35   400  6949
#> 2  2   1500     50   6588    358  1450  6230
#> 3  3   1550    654    633    358   896   275
#> 4  4    200   8555    355   1331 -8355  -976
#> 5  5      0      0     84     86     0    -2

Upvotes: 2

Related Questions