joy_1379
joy_1379

Reputation: 499

Insert rows for missing time in r

For each customer_ID I want to insert their missing time of half hour duration for each date with corresponding duration_1 and duration_2 with 0 values.

date           id        Time              duration_1      duration_2
2021-06-14     1001      10:31-11:00         10             15
2021-06-14     1001      11:01-11:30         NA             28
2021-06-14     1001      11:31-12:00         NA             30
2021-06-14     1001      16:01-16:30         12             20
2021-06-14     1002      14:01-14:30         25             50
2021-06-14     1002      14:31-15:00         NA             60
2021-06-15     1001      18:01-18:30         20             35
2021-06-15     1001      18:31-19:00         28             40 

For e.g. id 1001 has four rows for 14th June with duration_1 and duration_2 for time 10:31-11:00, 11:01-11:30 and so on. Time here is split with half hour duration. I want to insert the remaining time starting from 2:00am to 26.00am for each id each date. Time starts from 2.01 AM and goes upto 26:00 with half hour slot. i.e. after insertion each Id will have 48 records of half hour split for each date as show below

date           id        Time              duration_1      duration_2
2021-06-14     1001      2:01-2:30           0               0
2021-06-14     1001      2:31-3:00           0               0
2021-06-14     1001      3:01-3:30           0               0
2021-06-14     1001      3:31-4:00           0               0
----            --       ----                -               -
----            --       ----                -               -
2021-06-15     1001      24:01-24:30         0               0
2021-06-15     1001      24:31-25:00         0               0
2021-06-15     1001      25:01-25:30         0               0
2021-06-15     1001      25:31-26:00         0               0

I got some clue here: Insert rows for missing weeks in r but not able to replicate the same logic here. Any help is indeed appreciated!

Upvotes: 0

Views: 47

Answers (1)

Ronak Shah
Ronak Shah

Reputation: 388982

You can create a vector that you want to include for every date -

vec <- sprintf('%d:%02d-%d:%02d', rep(2:25, each = 2), c(1, 31), c(2, rep(3:25, each = 2), 26), c(30, 0))
vec

# [1] "2:01-2:30"   "2:31-3:00"   "3:01-3:30"   "3:31-4:00"   "4:01-4:30"  
# [6] "4:31-5:00"   "5:01-5:30"   "5:31-6:00"   "6:01-6:30"   "6:31-7:00"  
#[11] "7:01-7:30"   "7:31-8:00"   "8:01-8:30"   "8:31-9:00"   "9:01-9:30"  
#[16] "9:31-10:00"  "10:01-10:30" "10:31-11:00" "11:01-11:30" "11:31-12:00"
#[21] "12:01-12:30" "12:31-13:00" "13:01-13:30" "13:31-14:00" "14:01-14:30"
#[26] "14:31-15:00" "15:01-15:30" "15:31-16:00" "16:01-16:30" "16:31-17:00"
#[31] "17:01-17:30" "17:31-18:00" "18:01-18:30" "18:31-19:00" "19:01-19:30"
#[36] "19:31-20:00" "20:01-20:30" "20:31-21:00" "21:01-21:30" "21:31-22:00"
#[41] "22:01-22:30" "22:31-23:00" "23:01-23:30" "23:31-24:00" "24:01-24:30"
#[46] "24:31-25:00" "25:01-25:30" "25:31-26:00"

#Convert to factor
vec <- factor(vec, vec)

and then you can use tidyr::complete -

tidyr::complete(df, date, id, Time = vec, 
                fill = list(duration_1 = 0, duration_2 = 0))

#   date          id Time        duration_1 duration_2
#   <chr>      <int> <chr>            <dbl>      <dbl>
# 1 2021-06-14  1001 10:01-10:30          0          0
# 2 2021-06-14  1001 10:31-11:00         10         15
# 3 2021-06-14  1001 11:01-11:30          0         28
# 4 2021-06-14  1001 11:31-12:00          0         30
# 5 2021-06-14  1001 12:01-12:30          0          0
# 6 2021-06-14  1001 12:31-13:00          0          0
# 7 2021-06-14  1001 13:01-13:30          0          0
# 8 2021-06-14  1001 13:31-14:00          0          0
# 9 2021-06-14  1001 14:01-14:30          0          0
#10 2021-06-14  1001 14:31-15:00          0          0
# … with 182 more rows

Upvotes: 1

Related Questions