Scarabee
Scarabee

Reputation: 5714

Create new column containing previous value with same ID

My data looks like this and is already sorted:

> dput(df)
structure(list(id = c(1, 2, 3, 3, 2, 2, 1), x = c(20, 40, 70, 
70, 80, 40, 70)), .Names = c("id", "x"), row.names = c(NA, -7L
), class = "data.frame")

> df
  id  x
1  1 20
2  2 40
3  3 70
4  3 70
5  2 80
6  2 40
7  1 70

I need to create a new column containing the previous value of x with same id. So the result should be:

> df
  id  x old_x
1  1 20    70
2  2 40    80
3  3 70    70
4  3 70    NA
5  2 80    40
6  2 40    NA
7  1 70    NA

I can do it this way:

for (i in 1:nrow(df)){
  id0 = df$id[i]
  j = i + match(id0 , df$id[i+1:nrow(df)])
  df$old_x[i] = df$x[j]
}

but it's far too slow. What would be the best way to do it?

Thanks for your help!

Upvotes: 1

Views: 125

Answers (3)

akrun
akrun

Reputation: 887991

We can use data.table. Convert the 'data.frame' to 'data.table' (setDT(df)), grouped by 'id', we assign (:=) the lead values in 'x' to 'old_x'. This should be very fast as we are assigning in place.

library(data.table)
setDT(df)[,old_x := shift(x, type = "lead") , by = id]
df
#   id  x old_x
#1:  1 20    70
#2:  2 40    80
#3:  3 70    70
#4:  3 70    NA
#5:  2 80    40
#6:  2 40    NA
#7:  1 70    NA

Upvotes: 2

lmo
lmo

Reputation: 38520

Here is a base R method with ave and tail:

df$old_x <- ave(df$x, df$id, FUN=function(i) c(tail(i, -1), NA))

df
  id  x old_x
1  1 20    70
2  2 40    80
3  3 70    70
4  3 70    NA
5  2 80    40
6  2 40    NA
7  1 70    NA

Upvotes: 1

Sumedh
Sumedh

Reputation: 4965

Using dplyr:

library(dplyr)
df %>% group_by(id) %>% mutate(old_x = lead(x))

#Source: local data frame [7 x 3]
#Groups: id [3]

#     id     x old_x
#  <dbl> <dbl> <dbl>
#1     1    20    70
#2     2    40    80
#3     3    70    70
#4     3    70    NA
#5     2    80    40
#6     2    40    NA
#7     1    70    NA

Upvotes: 3

Related Questions