Reputation: 107
I would like to create a new column that displays the first year in V2 where the value of V3 appears. However, after null values I would like to put again the first year for the reappearance of V3.
That is, in possession of the following data:
I would like to get a new V4 column as follows:
I appreciate any help.
Below are the data:
structure(list(V1 = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3,
3, 3), V2 = c(2005, 2006, 2007, 2008, 2009, 2005, 2006, 2007,
2008, 2009, 2005, 2006, 2007, 2008, 2009), V3 = c(0, 0, 10, 25,
35, 12, 15, 0, 15, 17, 13, 0, 0, 15, 12)), row.names = c(NA,
15L), class = "data.frame")
Upvotes: 1
Views: 230
Reputation: 30494
Using tidyverse
and rleid
from data.table
you can try the following. You can group_by
both V1
as well as a second group based on whether value in V3
is zero. This assumes the years are in chronological order (if not, may need to add arrange
by V2
first).
library(tidyverse)
library(data.table)
df %>%
group_by(V1, grp = rleid(V3 != 0)) %>%
mutate(V4 = ifelse(V3 == 0, 0, first(V2))) %>%
ungroup %>%
select(-grp)
Output
V1 V2 V3 V4
<dbl> <dbl> <dbl> <dbl>
1 1 2005 0 0
2 1 2006 0 0
3 1 2007 10 2007
4 1 2008 25 2007
5 1 2009 35 2007
6 2 2005 12 2005
7 2 2006 15 2005
8 2 2007 0 0
9 2 2008 15 2008
10 2 2009 17 2008
11 3 2005 13 2005
12 3 2006 0 0
13 3 2007 0 0
14 3 2008 15 2008
15 3 2009 12 2008
Upvotes: 2