Please have a look at the reprex at the end of the post. I slightly modified a sophisticated (for me!) function to calculate lagged variables and add them to an existing tibble while giving them custom names. The ideas come from
However, when I try to apply this function to all the numeric variables in a tibble, I get an error message and I bang my head against the wall. Any suggestion is appreciated.
library(tidyverse)
library(rlang) # assumed here so that quo_text() is available

df <- tibble(x = LETTERS[1:20], y = 1:20, z = 31:50)
df |> glimpse()
#> Rows: 20
#> Columns: 3
#> $ x <chr> "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N"…
#> $ y <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
#> $ z <int> 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, …
## See
mylags <- function(data, variable, n = 10) {
  variable <- enquo(variable)
  indices <- seq_len(n)
  # one quosure per lag, named lag_<variable>_<NN>
  quosures <- map(indices, ~ quo(lag(!!variable, !!.x))) %>%
    set_names(sprintf("lag_%s_%02d", quo_text(variable), indices))
  mutate(data, !!!quosures)
}

df_lag <- df |>
  mylags(y, 3)
df_lag |> glimpse() ##this works as intended, but...
#> Rows: 20
#> Columns: 6
#> $ x <chr> "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "…
#> $ y <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18…
#> $ z <int> 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 4…
#> $ lag_y_01 <int> NA, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17…
#> $ lag_y_02 <int> NA, NA, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16…
#> $ lag_y_03 <int> NA, NA, NA, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15…
df_lag2 <- df |>
  mutate(across(where(is.numeric), ~ mylags(.x, 3))) ## but this fails
#> Error in `mutate()`:
#> ℹ In argument: `across(where(is.numeric), ~mylags(.x, 3))`.
#> Caused by error in `across()`:
#> ! Can't compute column `y`.
#> Caused by error in `UseMethod()`:
#> ! no applicable method for 'mutate' applied to an object of class "c('integer', 'numeric')"
## and I do not understand why.
#> Error: Can't show last error because no error was recorded yet
Created on 2024-03-06 with reprex v2.1.0
Custom names can be generated quite flexibly with the .names argument of across().
An example combining the source column and the function applied:
iris |>
  select(starts_with('Sepal')) |>
  mutate(across(everything(),
                .fns = list(lag_3 = \(xs) lag(xs, 3),
                            lag_10 = \(xs) lag(xs, 10)
                ),
                .names = "{.col}_{.fn}"
  )
  ) |>
  tail(1)
## Sepal.Length Sepal.Width Sepal.Length_lag_3 Sepal.Length_lag_10
## 150 5.9 3 6.3 6.9
## Sepal.Width_lag_3 Sepal.Width_lag_10
## 150 2.5 3.1
If there's only one function to be applied (only the source column varies), the expression reduces to, e.g.:
iris |>
  select(starts_with('Sepal')) |>
  mutate(across(everything(),
                ~ lag(.x, 3),
                .names = "{.col}_lag3"
  )
  ) |>
  tail(1)
## Sepal.Length Sepal.Width Sepal.Length_lag3 Sepal.Width_lag3
## 150 5.9 3 6.3 2.5
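As for why the original attempt fails: across() hands each selected column to the function as a plain vector, whereas mylags() expects a whole data frame as its first argument, so mutate() ends up being called on an integer vector. Applied to the tibble from the question, the .names approach could look roughly like the sketch below. The helper make_lag() and the names lag_fns and df_lag2 are only illustrative, and the zero-padded function names are chosen to mimic the lag_y_01 style of the original output.

```r
library(dplyr)
library(purrr)

df <- tibble(x = LETTERS[1:20], y = 1:20, z = 31:50)

n <- 3

# Hypothetical helper: returns a function that lags its input by k
make_lag <- function(k) {
  force(k) # fix the value of k for this closure
  function(v) lag(v, k)
}

# Named list of lag functions: "01", "02", "03"
lag_fns <- map(seq_len(n), make_lag) |>
  set_names(sprintf("%02d", seq_len(n)))

# across() applies every function to every numeric column;
# .names builds e.g. lag_y_01 from column "y" and function "01"
df_lag2 <- df |>
  mutate(across(where(is.numeric), lag_fns, .names = "lag_{.col}_{.fn}"))

names(df_lag2)
```

This keeps the lagging logic as ordinary functions of a vector, which is what across() works with, rather than passing a data frame and a column name through tidy evaluation.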
