data.table way of complete + fill from tidyr with groups of different lengths

Question

I have the example below. How can I do the same thing with data.table?

df <- data.frame(person = c(1,2,2),
                 observation_id = c(3,3,5),
                 value = c(1,1,1),
                 ind1 = c(2,4,4),
                 ind2 = c(5,7,7))

library(dplyr)
library(tidyr)

df %>% 
  group_by(person) %>% 
  tidyr::complete(observation_id = first(ind1):first(ind2), tidyr::nesting(person)) %>% 
  tidyr::fill(value)

Expected output:

# A tibble: 8 x 5
# Groups:   person [2]
  observation_id person value  ind1  ind2
1              2      1    NA    NA    NA
2              3      1     1     2     5
3              4      1     1    NA    NA
4              5      1     1    NA    NA
5              4      2    NA    NA    NA
6              5      2     1     4     7
7              6      2     1    NA    NA
8              7      2     1    NA    NA

Thanks for any advice!

s_baldur · Accepted Answer

Here is something raw:

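library(data.table)
DT <- setDT(copy(df))
# Build each person's full observation_id range, join the data onto it, then carry value forward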
DT[DT[, .(observation_id = ind1[1]:ind2[1]), by = person], on = .(person, observation_id)
   ][, value := nafill(value, "locf"), by = person][]

#    person observation_id value ind1 ind2
# 1:      1              2    NA   NA   NA
# 2:      1              3     1    2    5
# 3:      1              4     1   NA   NA
# 4:      1              5     1   NA   NA
# 5:      2              4    NA   NA   NA
# 6:      2              5     1    4    7
# 7:      2              6     1   NA   NA
# 8:      2              7     1   NA   NA

Note 1: when this answer was written, nafill() was only available in the development version of data.table; it has since been released on CRAN (data.table 1.12.4 and later).

Note 2: the final [] is just for printing the results and can be skipped.
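
For readers who find the chained one-liner hard to follow, the same approach can be sketched as three separate steps. This is only an illustration on the question's example data; the intermediate names grid and res are made up here and are not part of the original answer.

```r
library(data.table)

# Same example data as in the question.
df <- data.frame(person = c(1, 2, 2),
                 observation_id = c(3, 3, 5),
                 value = c(1, 1, 1),
                 ind1 = c(2, 4, 4),
                 ind2 = c(5, 7, 7))
DT <- as.data.table(df)

# Step 1: one row per (person, observation_id), spanning each person's
# own range from the first ind1 to the first ind2.
grid <- DT[, .(observation_id = ind1[1]:ind2[1]), by = person]

# Step 2: join DT onto the grid. Every grid row is kept; rows of DT that
# fall outside a person's range (here person 2, observation 3) are dropped.
res <- DT[grid, on = .(person, observation_id)]

# Step 3: last observation carried forward, separately for each person.
# A leading NA stays NA because there is nothing before it to carry.
res[, value := nafill(value, type = "locf"), by = person]

print(res)
```

Because the grid is built per person, groups with different range lengths need no special handling. Note that nafill() works on numeric columns, so this sketch assumes value is numeric, as it is in the example.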
