Nuñes
Nuñes

Reputation: 107

Create a variable based on specific condition from another column based on Year

I would like to create a new column that displays the first year in V2 where the value of V3 appears. However, after null values I would like to put again the first year for the reappearance of V3.

That is, in possession of the following data:

enter image description here

I would like to get a new V4 column as follows:

enter image description here

I appreciate any help.

Below are the data:

structure(list(V1 = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 
3, 3), V2 = c(2005, 2006, 2007, 2008, 2009, 2005, 2006, 2007, 
2008, 2009, 2005, 2006, 2007, 2008, 2009), V3 = c(0, 0, 10, 25, 
35, 12, 15, 0, 15, 17, 13, 0, 0, 15, 12)), row.names = c(NA, 
15L), class = "data.frame")

Upvotes: 1

Views: 230

Answers (1)

Ben
Ben

Reputation: 30494

Using tidyverse and rleid from data.table you can try the following. You can group_by both V1 as well as a second group based on whether value in V3 is zero. This assumes the years are in chronological order (if not, may need to add arrange by V2 first).

library(tidyverse)
library(data.table)

df %>%
  group_by(V1, grp = rleid(V3 != 0)) %>%
  mutate(V4 = ifelse(V3 == 0, 0, first(V2))) %>%
  ungroup %>%
  select(-grp)

Output

      V1    V2    V3    V4
   <dbl> <dbl> <dbl> <dbl>
 1     1  2005     0     0
 2     1  2006     0     0
 3     1  2007    10  2007
 4     1  2008    25  2007
 5     1  2009    35  2007
 6     2  2005    12  2005
 7     2  2006    15  2005
 8     2  2007     0     0
 9     2  2008    15  2008
10     2  2009    17  2008
11     3  2005    13  2005
12     3  2006     0     0
13     3  2007     0     0
14     3  2008    15  2008
15     3  2009    12  2008

Upvotes: 2

Related Questions