R: Replace NAs in a date sequence

I have a df like this:

       Codigo  time       date
1       1001  8.77 2017-01-02
2       1001  8.96 2017-01-03
3       1001  9.56       <NA>
4       1001  7.81 2017-01-05
5       1001  0.00 2017-01-06
6       1001  9.58 2017-01-09
7       1001  9.64 2017-01-10
8       1001 12.11       <NA>
9       1005  6.86       <NA> 
10      1005  6.81 2017-05-04
11      1005  6.83 2017-05-05 
12      1005  6.86 2017-05-08
13      1005  6.90 2017-05-09
14      1005  6.42       <NA>

All the Wednesdays are NA. There are several different codes, and each code can have a different sequence (e.g. worker 1001 could have started on 02/01/2017 and worker 1005 on 03/05/2017).

I would like to replace these NAs with the logical date.

I was thinking that the solution could assign the next record's date minus 1 when the next record has the same code, and otherwise the previous record's date plus 1. No code has only a single record.

Thanks in advance.

Data

df <- data.frame(
      Codigo = c(1001L, 1001L, 1001L, 1001L, 1001L, 1001L, 1001L, 1001L, 1005L,
                 1005L, 1005L, 1005L, 1005L, 1005L),
        time = c(8.77, 8.96, 9.56, 7.81, 0, 9.58, 9.64, 12.11, 6.86, 6.81,
                 6.83, 6.86, 6.9, 6.42),
        date = c("2017-01-02", "2017-01-03", NA, "2017-01-05", "2017-01-06",
                 "2017-01-09", "2017-01-10", NA, NA, "2017-05-04",
                 "2017-05-05", "2017-05-08", "2017-05-09", NA)
)

Upvotes: 2

Answers (1)

dmi3kno

Reputation: 3055

You could impute the missing values by shifting the date column one row down and one row up within each Codigo and inferring each missing date from its neighbours:

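library(dplyr)

df %>% 
  # the sample data stores dates as character strings; date arithmetic needs Date
  mutate(date = as.Date(date)) %>% 
  group_by(Codigo) %>% 
  mutate(yesterday = lag(date),    # previous row's date within the same Codigo
         tomorrow  = lead(date),   # next row's date within the same Codigo
         date = case_when(
           # no usable next date (e.g. last row of the group): previous date + 1
           is.na(date) & is.na(tomorrow) ~ yesterday + lubridate::days(1),
           # otherwise: next date - 1
           is.na(date)                   ~ tomorrow - lubridate::days(1),
           TRUE ~ date)) %>%
  select(-yesterday, -tomorrow)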

#> # A tibble: 14 x 3
#> # Groups:   Codigo [2]
#>    Codigo  time       date
#>     <int> <dbl>     <date>
#>  1   1001  8.77 2017-01-02
#>  2   1001  8.96 2017-01-03
#>  3   1001  9.56 2017-01-04
#>  4   1001  7.81 2017-01-05
#>  5   1001  0.00 2017-01-06
#>  6   1001  9.58 2017-01-09
#>  7   1001  9.64 2017-01-10
#>  8   1001 12.11 2017-01-11
#>  9   1005  6.86 2017-05-03
#> 10   1005  6.81 2017-05-04
#> 11   1005  6.83 2017-05-05
#> 12   1005  6.86 2017-05-08
#> 13   1005  6.90 2017-05-09
#> 14   1005  6.42 2017-05-10
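
If you would rather avoid the dplyr/lubridate dependency, the same rule can be sketched in base R. This is only a sketch under assumptions: `fill_dates()` is a hypothetical helper written for this example (it is not part of any package), and it assumes, as in the question, that missing dates within a code are never adjacent to each other. The last lines check that every filled-in date really is a Wednesday.

```r
# Example data from the question; dates start out as character strings.
df <- data.frame(
  Codigo = c(rep(1001L, 8), rep(1005L, 6)),
  time   = c(8.77, 8.96, 9.56, 7.81, 0, 9.58, 9.64, 12.11,
             6.86, 6.81, 6.83, 6.86, 6.9, 6.42),
  date   = c("2017-01-02", "2017-01-03", NA, "2017-01-05", "2017-01-06",
             "2017-01-09", "2017-01-10", NA, NA, "2017-05-04",
             "2017-05-05", "2017-05-08", "2017-05-09", NA)
)
df$date <- as.Date(df$date)

# Hypothetical helper: fill the NAs of one code's date vector.
fill_dates <- function(d) {
  n   <- length(d)
  nxt <- c(d[-1], as.Date(NA))   # next row's date (NA on the last row)
  prv <- c(as.Date(NA), d[-n])   # previous row's date (NA on the first row)
  use_next <- is.na(d) & !is.na(nxt)   # a following date exists: next - 1
  use_prev <- is.na(d) & is.na(nxt)    # no following date: previous + 1
  d[use_next] <- nxt[use_next] - 1
  d[use_prev] <- prv[use_prev] + 1
  d
}

# Apply the helper separately to the rows of each Codigo.
filled <- df$date
for (idx in split(seq_len(nrow(df)), df$Codigo)) {
  filled[idx] <- fill_dates(df$date[idx])
}

# Sanity check: "%u" gives the ISO weekday (1 = Monday), so Wednesday is "3".
was_missing <- is.na(df$date)
all(format(filled[was_missing], "%u") == "3")

df$date <- filled
```

The `%u` format is used for the check instead of `weekdays()` because its result does not depend on the session locale.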

Upvotes: 3
