Reputation: 5714
My data looks like this and is already sorted:
> dput(df)
structure(list(id = c(1, 2, 3, 3, 2, 2, 1), x = c(20, 40, 70,
70, 80, 40, 70)), .Names = c("id", "x"), row.names = c(NA, -7L
), class = "data.frame")
> df
id x
1 1 20
2 2 40
3 3 70
4 3 70
5 2 80
6 2 40
7 1 70
I need to create a new column containing the previous value of x
with same id
. So the result should be:
> df
id x old_x
1 1 20 70
2 2 40 80
3 3 70 70
4 3 70 NA
5 2 80 40
6 2 40 NA
7 1 70 NA
I can do it this way:
for (i in 1:nrow(df)){
id0 = df$id[i]
j = i + match(id0 , df$id[i+1:nrow(df)])
df$old_x[i] = df$x[j]
}
but it's far too slow. What would be the best way to do it?
Thanks for your help!
Upvotes: 1
Views: 125
Reputation: 887991
We can use data.table
. Convert the 'data.frame' to 'data.table' (setDT(df)
), grouped by 'id', we assign (:=
) the lead
values in 'x' to 'old_x'. This should be very fast as we are assigning in place.
library(data.table)
setDT(df)[,old_x := shift(x, type = "lead") , by = id]
df
# id x old_x
#1: 1 20 70
#2: 2 40 80
#3: 3 70 70
#4: 3 70 NA
#5: 2 80 40
#6: 2 40 NA
#7: 1 70 NA
Upvotes: 2
Reputation: 38520
Here is a base R method with ave
and tail
:
df$old_x <- ave(df$x, df$id, FUN=function(i) c(tail(i, -1), NA))
df
id x old_x
1 1 20 70
2 2 40 80
3 3 70 70
4 3 70 NA
5 2 80 40
6 2 40 NA
7 1 70 NA
Upvotes: 1
Reputation: 4965
Using dplyr
:
library(dplyr)
df %>% group_by(id) %>% mutate(old_x = lead(x))
#Source: local data frame [7 x 3]
#Groups: id [3]
# id x old_x
# <dbl> <dbl> <dbl>
#1 1 20 70
#2 2 40 80
#3 3 70 70
#4 3 70 NA
#5 2 80 40
#6 2 40 NA
#7 1 70 NA
Upvotes: 3