Reputation: 373
I need to fill missing values based on previous and/or forward values, based on grouping. I would like to accomplish this using dplyr (although data.table solutions will also be welcomed).
Sample data:
testing <- tibble(key = c(10,10,10,10,10,10,20,20,20,20,20,20),
year = c(15,15,16,16,17,17,15,15,16,16,17,17),
name = c("abc","abc","","","dfg","dfg",
"","","nmm","nmm","",""),
is_name = c(1,1,0,0,1,1,0,0,0,0,0,0))
key year name is_name
<dbl> <dbl> <chr> <dbl>
1 10 15 abc 1
2 10 15 abc 1
3 10 16 0
4 10 16 0
5 10 17 dfg 1
6 10 17 dfg 1
7 20 15 0
8 20 15 0
9 20 16 nmm 0
10 20 16 nmm 0
11 20 17 0
12 20 17 0
I want to fill missing names (name
) in a way that if the previous year
of the same key
is marked as is_name==1
, than fill the missing with it.
So an output can be:
key year name is_name name_new
<dbl> <dbl> <chr> <dbl> <chr>
1 10 15 abc 1 abc
2 10 15 abc 1 abc
3 10 16 0 abc
4 10 16 0 abc
5 10 17 dfg 1 dfg
6 10 17 dfg 1 dfg
7 20 15 0
8 20 15 0
9 20 16 nmm 0 nmm
10 20 16 nmm 0 nmm
11 20 17 0
12 20 17 0
I tried to use lag
and leap
, but it didn't move beyond the group (key
)properly.
Thanks!
Upvotes: 0
Views: 48
Reputation: 3369
This might work for you
library(dplyr)
library(zoo)
testing <- testing %>%
arrange(key, year) %>%
mutate(name = ifelse(name == "", NA, name),
is_name = ifelse(is_name == 0, NA, is_name)) %>%
group_by(key) %>%
mutate(newname = ifelse((is.na(name) & na.locf(is_name, na.rm = FALSE) == 1), na.locf(name, na.rm = FALSE), name),
is_name = ifelse(is.na(is_name),0,is_name))
Upvotes: 1