I'm trying to create a function that creates a variable. Like this:
Add_Extreme_Variable <- function(dataframe, variable, variable_name){
  dataframe %>%
    group_by(cod_station, year_station) %>%
    mutate(variable_name= ifelse(variable > quantile(variable, 0.95, na.rm=TRUE),1,0)) %>%
    ungroup() %>%
    return()
}
df <- Add_Extreme_Variable (df, rain, extreme_rain)
df is the dataframe I'm working with, rain is a numeric variable in df, and extreme_rain is the name of the variable I want to create.
If I use mutate_() everything works well, but the problem is that it's deprecated. However, the solutions I have found on Stack Overflow (1, 2, 3) and in the vignette don't seem to fit my problem, or they seem far more complicated than I need them to be. I cannot find good examples of how to work with quo(), !! without a space, !! with a space, or how to replace = with :=. I also don't know whether working with them will solve my problem at all, or whether they are even necessary, since the ultimate goal of this function is to make the code cleaner. Any suggestions?
Upvotes: 0
You can use {{ }} (curly curly). See the Tidy evaluation section in Hadley Wickham's Advanced R book. Below is an example using the gapminder dataset.
library(gapminder)
library(rlang)
library(tidyverse)
Add_Extreme_Variable2 <- function(dataframe, group_by_var1, group_by_var2, variable, variable_name) {
  res <- dataframe %>%
    group_by({{group_by_var1}}, {{group_by_var2}}) %>%
    mutate({{variable_name}} := ifelse({{variable}} > quantile({{variable}}, 0.95, na.rm = TRUE), 1, 0)) %>%
    ungroup()
  return(res)
}
df <- Add_Extreme_Variable2(gapminder, continent, year, pop, pop_extreme) %>%
  arrange(desc(pop_extreme))
df
#> # A tibble: 1,704 x 7
#> country continent year lifeExp pop gdpPercap pop_extreme
#> <fct> <fct> <int> <dbl> <int> <dbl> <dbl>
#> 1 Australia Oceania 1952 69.1 8691212 10040. 1
#> 2 Australia Oceania 1957 70.3 9712569 10950. 1
#> 3 Australia Oceania 1962 70.9 10794968 12217. 1
#> 4 Australia Oceania 1967 71.1 11872264 14526. 1
#> 5 Australia Oceania 1972 71.9 13177000 16789. 1
#> 6 Australia Oceania 1977 73.5 14074100 18334. 1
#> 7 Australia Oceania 1982 74.7 15184200 19477. 1
#> 8 Australia Oceania 1987 76.3 16257249 21889. 1
#> 9 Australia Oceania 1992 77.6 17481977 23425. 1
#> 10 Australia Oceania 1997 78.8 18565243 26998. 1
#> # ... with 1,694 more rows
summary(df)
#> country continent year lifeExp
#> Afghanistan: 12 Africa :624 Min. :1952 Min. :23.60
#> Albania : 12 Americas:300 1st Qu.:1966 1st Qu.:48.20
#> Algeria : 12 Asia :396 Median :1980 Median :60.71
#> Angola : 12 Europe :360 Mean :1980 Mean :59.47
#> Argentina : 12 Oceania : 24 3rd Qu.:1993 3rd Qu.:70.85
#> Australia : 12 Max. :2007 Max. :82.60
#> (Other) :1632
#> pop gdpPercap pop_extreme
#> Min. :6.001e+04 Min. : 241.2 Min. :0.00000
#> 1st Qu.:2.794e+06 1st Qu.: 1202.1 1st Qu.:0.00000
#> Median :7.024e+06 Median : 3531.8 Median :0.00000
#> Mean :2.960e+07 Mean : 7215.3 Mean :0.07042
#> 3rd Qu.:1.959e+07 3rd Qu.: 9325.5 3rd Qu.:0.00000
#> Max. :1.319e+09 Max. :113523.1 Max. :1.00000
#>
Created on 2019-11-10 by the reprex package (v0.3.0)
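As a variation on the function above, here is a minimal sketch that builds the new column name from the input column instead of taking it as a separate argument, and passes the grouping columns through `...`. It assumes rlang 0.4.3 or later, which is where the glue-style `"{{ }}"` name syntax became available. The function name `add_extreme_flag` and the `_extreme` suffix are hypothetical choices for illustration, and `mtcars` stands in for the original data.

```r
library(dplyr)

# Hypothetical helper (not from the original post): flags values above the
# within-group 95th percentile. Grouping columns are forwarded through `...`,
# and the new column is named "<variable>_extreme" via the glue-style syntax.
add_extreme_flag <- function(dataframe, variable, ...) {
  dataframe %>%
    group_by(...) %>%
    mutate(
      "{{ variable }}_extreme" := as.integer(
        {{ variable }} > quantile({{ variable }}, 0.95, na.rm = TRUE)
      )
    ) %>%
    ungroup()
}

# Group mtcars by cyl and am, then flag extreme mpg values
flagged <- add_extreme_flag(mtcars, mpg, cyl, am)
names(flagged)
```

Deriving the name this way removes one argument from the call, at the cost of a fixed naming scheme. If the caller needs to choose the name freely, the `{{variable_name}} :=` form shown above is the simpler option.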
Upvotes: 4
We can use rlang's curly-curly ({{}}) operator along with enquo to add new columns from unquoted inputs.
library(dplyr)
library(rlang)
Add_Extreme_Variable <- function(dataframe, variable, variable_name){
  col_name <- enquo(variable_name)
  dataframe %>%
    group_by(cyl, am) %>%
    mutate(!!col_name := as.integer({{variable}} >
             quantile({{variable}}, 0.95, na.rm=TRUE))) %>%
    ungroup()
}
Add_Extreme_Variable(mtcars, mpg, new)
# A tibble: 32 x 12
# mpg cyl disp hp drat wt qsec vs am gear carb new
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int>
# 1 21 6 160 110 3.9 2.62 16.5 0 1 4 4 0
# 2 21 6 160 110 3.9 2.88 17.0 0 1 4 4 0
# 3 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1 0
# 4 21.4 6 258 110 3.08 3.22 19.4 1 0 3 1 1
# 5 18.7 8 360 175 3.15 3.44 17.0 0 0 3 2 0
# 6 18.1 6 225 105 2.76 3.46 20.2 1 0 3 1 0
# 7 14.3 8 360 245 3.21 3.57 15.8 0 0 3 4 0
# 8 24.4 4 147. 62 3.69 3.19 20 1 0 4 2 1
# 9 22.8 4 141. 95 3.92 3.15 22.9 1 0 4 2 0
#10 19.2 6 168. 123 3.92 3.44 18.3 1 0 4 4 0
# … with 22 more rows
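Since the function above mixes enquo()/!! with {{ }}, it may help to see the two spellings side by side. The sketch below writes the same function twice, once with explicit enquo() and !! and once with {{ }} only, and compares the resulting columns. It assumes rlang 0.4.0 or later (where {{ }} was introduced); `flag_enquo` and `flag_curly` are hypothetical names used only for this comparison.

```r
library(dplyr)
library(rlang)

# Version A: capture each argument with enquo(), then unquote with !!.
# The parentheses around (!!var) keep the unquote from swallowing the comparison.
flag_enquo <- function(dataframe, variable, variable_name) {
  var <- enquo(variable)
  new_name <- enquo(variable_name)
  dataframe %>%
    group_by(cyl, am) %>%
    mutate(!!new_name := as.integer((!!var) > quantile(!!var, 0.95, na.rm = TRUE))) %>%
    ungroup()
}

# Version B: {{ }} captures and unquotes in a single step.
flag_curly <- function(dataframe, variable, variable_name) {
  dataframe %>%
    group_by(cyl, am) %>%
    mutate({{ variable_name }} := as.integer(
      {{ variable }} > quantile({{ variable }}, 0.95, na.rm = TRUE)
    )) %>%
    ungroup()
}

res_enquo <- flag_enquo(mtcars, mpg, new)
res_curly <- flag_curly(mtcars, mpg, new)

# Compare the flag column produced by each version
identical(res_enquo$new, res_curly$new)
```

In both versions the name on the left-hand side needs := rather than =, because R does not allow an unquoting expression on the left of = inside a function call.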
Upvotes: 2