Is there a way to fill in missing dates with 0s using dplyr?

Question

I have a dataset like this:

id  date     value      
1   8/06/12    1         
1   8/08/12    1         
2   8/07/12    2         
2   8/08/12    1

Every id should a have a value for every date. When an id is missing a particular date, that row needs to be added with a value of 0. E.g.,

id  date     value      
1   8/06/12    1   
1   8/07/12    0      
1   8/08/12    1  
2   8/06/12    0         
2   8/07/12    2         
2   8/08/12    1

I'm trying to figure out how to add the rows with 0s. There's a good solution here: R - Fill missing dates by group. However, I can't use the tidyr::complete function because I'm using sparklyr and, as far as I know, need to stay within dplyr functions.

steveo&#39;america · Accepted Answer

In sparklyr, you must use Spark functions. This is a job for coalesce. First you have to fill out all the pairs of ids and dates you expect to see, so maybe something like: (edit)

all_id <- old_data %>% distinct(id) %>% mutate(common=0)
all_date <- old_data %>% distinct(date) %>% mutate(common=0)
all_both <- all_id %>% full_join(all_date,by='common')
data <- old_data %>%
  right_join(all_both %>% select(-common),by=c('id','date')) %>%
  mutate(value=`coalesce(value,0)`)

I have assumed you have all the dates and ids you care about in your old data, though that might not be the case.

Is there a way to fill in missing dates with 0s using dplyr?

Answers (2)

Related Questions