Reputation: 2077
I wish to have the factor that happened earlier as a new row.
This is my data
df <- data.frame (id =c(1,1,2,2,1), date= c(20161002,20151019, 20160913, 20161117, 20160822), factor = c("A" , "B" ,"C" ,"D" ,"H"))
and I want to have an additional row that shows the immediate last factor. So, my ideal output is:
id date factor col2
1 1 20161002 A H
2 1 20151019 B NA
3 2 20160913 C NA
4 2 20161117 D C
5 1 20160822 H B
For instance, for id 1 in the first row the previous factor happend in 20160822 and its value was H
.
What I tied does not consider the last date
library (dplyr)
library(zoo)
mutate( col2 = na.locf(factor))
Upvotes: 1
Views: 55
Reputation: 388982
We can use dplyr
. Convert the character date
to Date format. Then we sort the date by group (id
) using arrange
and select the last factor
using lag
.
df$date <- as.Date(as.character(df$date), "%Y%m%d")
library(dplyr)
df %>%
group_by(id) %>%
arrange(date) %>%
mutate(col2 = lag(factor))
# id date factor col2
# <dbl> <date> <fctr> <fctr>
#1 1 2015-10-19 B NA
#2 1 2016-08-22 H B
#3 2 2016-09-13 C NA
#4 1 2016-10-02 A H
#5 2 2016-11-17 D C
Upvotes: 2
Reputation: 2806
do this
library(data.table)
df$date = as.Date(as.character(df$date),"%Y%m%d")
setDT(df)
setorder(df,id,date)
df[, "col2" := shift(factor), by = .(id)]
id date factor col2
1: 1 2015-10-19 B NA
2: 1 2016-08-22 H B
3: 1 2016-10-02 A H
4: 2 2016-09-13 C NA
5: 2 2016-11-17 D C
Upvotes: 2