Nick
Nick

Reputation: 195

Iteratively create new columns in r Function

As a disclaimer, I rarely use any apply functions, and this might be the reason why I am struggling with this task.

I would like to create multiple new columns (say 5) within an R dataframe

imagine a dataset representing the daily closing price of the S&P500

df <- data.frame(symbol = rep(S&P_500, n = 5), value =c(2500,2550,2560,2600,2400))

If I wanted (inefficiently) to create 5 new variables, each indicating for each row, the lag value n, (n, 1:5)

I could do :

df %>%
mutate(S&P500_LAG_1 = lag(value, n=1),
       S&P500_LAG_2 = lag(value, n=2),
       S&P500_LAG_3 = lag(value, n=3),
       S&P500_LAG_4 = lag(value, n=4),
       S&P500_LAG_5 = lag(value, n=5))

How could I reduce this, (and also probably make it more efficient) and create a function for this that takes as inputs the symbol (so that I can scale this method to other symbols) and outputs the lag value of the symbol I pass on as an input?

Thanks!

Upvotes: 1

Views: 243

Answers (1)

akrun
akrun

Reputation: 887028

We could use shift from data.table that can take multiple values for n

library(data.table)
setDT(df)[,paste0("S&P500_LAG_", 1:5) :=  shift(value, n = 1:5)]

If there are multiple symbols in a single column, then do a group by symbol

setDT(df)[,paste0("symbol_LAG_", 1:5) :=  shift(value, n = 1:5), by = symbol]

In tidyverse, one approach is to map

library(dplyr)
library(stringr)
library(purrr)
map_dfc(1:5, ~ df %>%
              transmute(!! str_c("S&P500_LAG_", .x) := lag(value, n = .x))) %>%
   bind_cols(df, .)

With multiple symbols

map_dfc(1:5, ~ df %>%
              group_by(symbol) %>%
              transmute(!! str_c("symbol_LAG_", .x) := lag(value, n = .x))) %>%
   bind_cols(df, .)

Upvotes: 3

Related Questions