Reputation: 195
As a disclaimer, I rarely use any apply functions, and this might be the reason why I am struggling with this task.
I would like to create multiple new columns (say 5) within an R dataframe
imagine a dataset representing the daily closing price of the S&P500
df <- data.frame(symbol = rep(S&P_500, n = 5), value =c(2500,2550,2560,2600,2400))
If I wanted (inefficiently) to create 5 new variables, each indicating for each row, the lag value n, (n, 1:5)
I could do :
df %>%
mutate(S&P500_LAG_1 = lag(value, n=1),
S&P500_LAG_2 = lag(value, n=2),
S&P500_LAG_3 = lag(value, n=3),
S&P500_LAG_4 = lag(value, n=4),
S&P500_LAG_5 = lag(value, n=5))
How could I reduce this, (and also probably make it more efficient) and create a function for this that takes as inputs the symbol (so that I can scale this method to other symbols) and outputs the lag value of the symbol I pass on as an input?
Thanks!
Upvotes: 1
Views: 243
Reputation: 887028
We could use shift
from data.table
that can take multiple values for n
library(data.table)
setDT(df)[,paste0("S&P500_LAG_", 1:5) := shift(value, n = 1:5)]
If there are multiple symbol
s in a single column, then do a group by symbol
setDT(df)[,paste0("symbol_LAG_", 1:5) := shift(value, n = 1:5), by = symbol]
In tidyverse
, one approach is to map
library(dplyr)
library(stringr)
library(purrr)
map_dfc(1:5, ~ df %>%
transmute(!! str_c("S&P500_LAG_", .x) := lag(value, n = .x))) %>%
bind_cols(df, .)
With multiple symbols
map_dfc(1:5, ~ df %>%
group_by(symbol) %>%
transmute(!! str_c("symbol_LAG_", .x) := lag(value, n = .x))) %>%
bind_cols(df, .)
Upvotes: 3