johnny
johnny

Reputation: 473

Fill values in between values in rows in R based on condition

I have data that look like this:

id <- c(rep(1,5), rep(2,5), rep(3,4), rep(4,2), rep(5, 1))
year <- c(1990,1991,1992,1993,1994,1990,1991,1992,1993,1994,1990,1991,1992,1994,1990,1994, 1994)
gender <- c(rep("female", 5), rep("male", 5), rep("male", 4), rep("female", 2), rep("male", 1))

dat <- data.frame(id,year,gender)

As you can see, id 1 and 2 have observations for every year between 1990 and 1994, while there are missing observations in between 1990 and 1994 for ids 3 and 4, and, finally, only one observation for id 5.

What I want to do is to copy column id and gender and insert the missing observations for id 3 and 4 so that there are observations from 1990 too 1994, while I want to do nothing with id 1, 2 or 5. Is there are way to create a sequence with numbers from the oldest to the newest observation based on the condition that there is a gap between two numbers grouped by a variable, such as id?

The final result should look like this:

     id  year gender
   <dbl> <dbl> <chr> 
 1     1  1990 female
 2     1  1991 female
 3     1  1992 female
 4     1  1993 female
 5     1  1994 female
 6     2  1990 male  
 7     2  1991 male  
 8     2  1992 male  
 9     2  1993 male  
10     2  1994 male  
11     3  1990 male  
12     3  1991 male  
13     3  1992 male  
14     3  1993 male  
15     3  1994 male  
16     4  1990 female
17     4  1991 female
18     4  1992 female
19     4  1993 female
20     4  1994 female
21     5  1994 male

Upvotes: 0

Views: 882

Answers (1)

Ronak Shah
Ronak Shah

Reputation: 388982

Filter the dataset for id 3 and 4, complete their observations and bind the data to other id's where id is not 3 and 4.

library(dplyr)
library(tidyr)

complete_id <- c(3, 4)

dat %>%
  filter(id %in% complete_id) %>%
  complete(id, year = 1990:1994) %>%
  fill(gender) %>%
  bind_rows(dat %>% filter(!id %in% complete_id)) %>%
  arrange(id)

#   id year gender
#1   1 1990 female
#2   1 1991 female
#3   1 1992 female
#4   1 1993 female
#5   1 1994 female
#6   2 1990   male
#7   2 1991   male
#8   2 1992   male
#9   2 1993   male
#10  2 1994   male
#11  3 1990   male
#12  3 1991   male
#13  3 1992   male
#14  3 1993   male
#15  3 1994   male
#16  4 1990 female
#17  4 1991 female
#18  4 1992 female
#19  4 1993 female
#20  4 1994 female
#21  5 1994   male

Upvotes: 1

Related Questions