Slyrs

Reputation: 131

Using diff() in R for multiple columns

I would like to calculate the first-order difference for many columns in a data frame without naming them explicitly. For a single column, this code works well:

set.seed(1)
Data <- data.frame(
  X = sample(1:10),
  Y = sample(1:10),
  Z = sample(1:10))
Newdata <- as.data.frame(diff(Data$X, lag = 1))

How do I calculate the same for a lot of columns, e.g. columns 2:200, in a data frame?

Upvotes: 3

Views: 5870

Answers (1)

BrodieG

Reputation: 52697

I think this does what you want:

as.data.frame(lapply(Data, diff, lag=1))
##    X  Y  Z
## 1  1 -1 -8
## 2  1  4  4
## 3  2  4 -5
## 4 -5 -5  8
## 5  6  2 -1
## 6  1  1 -1
## 7 -3 -4 -2
## 8  4 -3 -2
## 9 -9  8  1

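Since data frames are internally lists of columns, we can lapply over them: diff is applied to each column in turn, and as.data.frame turns the resulting list back into a data frame. You can use Data[1:2] instead of Data to difference just the first two columns, or any other valid column indexing such as Data[2:200].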
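
To make the column-subset case concrete, here is a minimal sketch. The data frame below uses fixed, made-up values (not the random sample from the question) so the result is easy to check by hand, and the choice of columns 2:3 is only for illustration.

```r
# Hypothetical example data with fixed values, chosen so the differences are easy to verify
Data <- data.frame(
  X = c(1, 4, 9, 16),
  Y = c(2, 2, 5, 3),
  Z = c(10, 7, 7, 1))

# Data[2:3] is still a data frame (a list of columns), so lapply visits Y and Z only
Newdata <- as.data.frame(lapply(Data[2:3], diff, lag = 1))
Newdata
##    Y  Z
## 1  0 -3
## 2  3  0
## 3 -2 -6
```

The result has one row fewer than the input, because each difference consumes a pair of consecutive rows. If every selected column is numeric, diff() also accepts a matrix and differences each column separately, so diff(as.matrix(Data[2:3])) is a possible alternative.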

Upvotes: 6
