QFi

Reputation: 291

How to lag multiple specific columns of a data frame in R

I would like to lag multiple specific columns of a data frame in R.

Take this generic example, where a 0/1 vector flags which columns of my data frame need to be lagged:

Lag <- c(0, 1, 0, 1)
Lag.Index <- is.element(Lag, 1)
df <- data.frame(x1 = 1:8, x2 = 1:8, x3 = 1:8, x4 = 1:8)

My initial dataframe:

        x1  x2  x3  x4   
    1   1   1   1   1
    2   2   2   2   2
    3   3   3   3   3
    4   4   4   4   4 
    5   5   5   5   5
    6   6   6   6   6
    7   7   7   7   7
    8   8   8   8   8 

I would like to compute the following dataframe:

        x1  x2  x3  x4
    1   1   NA  1   NA
    2   2   1   2   1
    3   3   2   3   2
    4   4   3   4   3
    5   5   4   5   4
    6   6   5   6   5
    7   7   6   7   6
    8   8   7   8   7

I know how to do this for a single lagged column, as shown here, but I can't find an elegant way to do it for multiple lagged columns. Any help is very much appreciated.

Upvotes: 2

Views: 1799

Answers (4)

ThomasIsCoding

Reputation: 101179

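A data.table option using shift along with Vectorize, so that each column is shifted by its own entry of Lag (note that the result is a matrix):

> library(data.table)
> setDT(df)[, Vectorize(shift)(.SD, Lag)]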
     x1 x2 x3 x4
[1,]  1 NA  1 NA
[2,]  2  1  2  1
[3,]  3  2  3  2
[4,]  4  3  4  3
[5,]  5  4  5  4
[6,]  6  5  6  5
[7,]  7  6  7  6
[8,]  8  7  8  7

Upvotes: 2

akrun

Reputation: 886998

Convert Lag to a logical vector, use it to pick the corresponding column names, and apply lag to those columns with across from dplyr:

library(dplyr)
df %>%
  mutate(across(names(.)[as.logical(Lag)], lag))
#  x1 x2 x3 x4
#1  1 NA  1 NA
#2  2  1  2  1
#3  3  2  3  2
#4  4  3  4  3
#5  5  4  5  4
#6  6  5  6  5
#7  7  6  7  6
#8  8  7  8  7

Or, in base R, prepend a row of NA to the selected columns and drop their last row:

df[as.logical(Lag)] <- rbind(NA, df[-nrow(df), as.logical(Lag)])
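If the lag amounts might differ from column to column, the same base R idea can be generalised. This is only a minimal sketch: `lag_by` is a hypothetical helper written for illustration (it is not part of base R or any package), and it assumes every entry of `Lag` is a non-negative whole number.

```r
# Hypothetical helper (not from any package): shift a vector down by n
# positions, padding the top with NA. Assumes n is a non-negative integer.
lag_by <- function(x, n) {
  stopifnot(n >= 0)
  if (n == 0) return(x)                       # nothing to shift
  c(rep(NA, min(n, length(x))), head(x, -n))  # pad the top, drop the last n values
}

Lag <- c(0, 1, 0, 1)
df <- data.frame(x1 = 1:8, x2 = 1:8, x3 = 1:8, x4 = 1:8)

# Apply the helper column by column; assigning into lagged[] keeps the
# data.frame structure and column names.
lagged <- df
lagged[] <- Map(lag_by, df, Lag)
lagged
```

Working on a copy (`lagged`) leaves the original `df` untouched, which makes it easy to compare the two.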

Upvotes: 1

sa90210

Reputation: 585

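Not sure whether this is elegant enough, but I would use dplyr's mutate_at function to modify the selected columns. dplyr needs to be attached so that lag resolves to dplyr::lag rather than stats::lag, which does not shift the values of a plain vector:

library(dplyr)
df %>% mutate_at(.vars = vars(x2, x4), .funs = ~lag(., default = NA))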

Upvotes: 1

Ronak Shah

Reputation: 388862

You can use purrr's map2_dfc to lag each column by its own amount, taken from the matching entry of Lag.

purrr::map2_dfc(df, Lag, dplyr::lag)

#     x1    x2    x3    x4
#  <int> <int> <int> <int>
#1     1    NA     1    NA
#2     2     1     2     1
#3     3     2     3     2
#4     4     3     4     3
#5     5     4     5     4
#6     6     5     6     5
#7     7     6     7     6
#8     8     7     8     7

Or with data.table:

library(data.table)
setDT(df)[, names(df) := Map(shift, .SD, Lag)]
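Because `Map` pairs each column with its own lag, the same pattern should also cover lags other than 0 and 1. A small sketch under stated assumptions: `Lag2` and `dt` are hypothetical names introduced here, and `as.data.table()` is used instead of `setDT()` so that the original `df` is not modified by reference.

```r
library(data.table)

df <- data.frame(x1 = 1:8, x2 = 1:8, x3 = 1:8, x4 = 1:8)
Lag2 <- c(0, 1, 2, 3)        # hypothetical per-column lag amounts

dt <- as.data.table(df)      # makes a copy; df itself stays unchanged
cols <- names(dt)

# shift() lags each column by the matching entry of Lag2, filling with NA
dt[, (cols) := Map(shift, .SD, Lag2)]
dt
```

Here `x1` is left as it is, `x2` is shifted by one row, `x3` by two, and `x4` by three.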

Upvotes: 5
