Reputation: 2251
I have a data frame with the structure below (df
). I want to modify ID2, PERIOD
such that the ID2, PERIOD number is carried forward until AMT > 0
(Note when AMT > 0, the EVID is 0). So, the first two rows of ID2==13/PERIOD==2
should actually be ID2==12/PERIOD==1
as appears in the dfout
data frame. Ans so on.
df <-
ID ID2 TIME DVID AMT DV PERIOD
1 12 0 0 50 NA 1
1 12 0.5 1 0 10 1
1 12 0.5 2 0 15 1
1 13 600 1 0 2.5 2
1 13 600 2 0 4 2
1 13 600 0 100 NA 2
1 13 602 1 0 20 2
1 13 602 2 0 35 2
1 14 800 1 0 5 3
1 14 800 2 0 10 3
1 14 800 0 50 NA 3
dfout <-
ID ID2 TIME DVID AMT DV PERIOD
1 12 0 0 50 NA 1
1 12 0.5 1 0 10 1
1 12 0.5 2 0 15 1
1 12 600 1 0 2.5 1
1 12 600 2 0 4 1
1 13 600 0 100 NA 2
1 13 602 1 0 20 2
1 13 602 2 0 35 2
1 13 800 1 0 5 2
1 13 800 2 0 10 2
1 14 800 0 50 NA 3
Is there a trick on how to do this in R?
Upvotes: 1
Views: 29
Reputation: 887881
We can loop over the columns ("ID2", "PERIOD"), using the cumsum
of the logical vector (DVID==0 & AMT > 0
) to create a grouping index, we change the unique
values.
df[c("ID2", "PERIOD")] <- lapply(df[c("ID2", "PERIOD")], function(x)
unique(x)[with(df, cumsum(DVID==0 & AMT > 0))])
df
# ID ID2 TIME DVID AMT DV PERIOD
#1 1 12 0.0 0 50 NA 1
#2 1 12 0.5 1 0 10.0 1
#3 1 12 0.5 2 0 15.0 1
#4 1 12 600.0 1 0 2.5 1
#5 1 12 600.0 2 0 4.0 1
#6 1 13 600.0 0 100 NA 2
#7 1 13 602.0 1 0 20.0 2
#8 1 13 602.0 2 0 35.0 2
#9 1 13 800.0 1 0 5.0 2
#10 1 13 800.0 2 0 10.0 2
#11 1 14 800.0 0 50 NA 3
Or using dplyr
library(dplyr)
df %>%
mutate_each(funs(unique(.)[cumsum(DVID==0 & AMT > 0)]), DV, PERIOD)
Upvotes: 2