joy_1379
joy_1379

Reputation: 499

Insert rows for missing weeks in r

For a large data set I am trying to insert missing weeks for each each customer ID with their corresponding amount to be 0. The week starts from w3 and goes up to w7.

df <- data.frame(cust_id = c("111","111","222","222","222","333","333","333","333"),
             week = c("w4","w5","w4","w5","w6","w3","w4","w5","w6"),
             amount = c(23,34,10,12,35,12,23,15,56))


cust_id week amount
  111   w4     23
  111   w5     34
  222   w4     10
  222   w5     12
  222   w6     35
  333   w3     12
  333   w4     23
  333   w5     15
  333   w6     56

For e.g. cust_id: 111, w3,w6 and w7 is missing. I want to insert those missing weeks and corresponding amount = 0 for cust_id 111 as shown below.

cust_id week amount
  111   w3      0
  111   w4     23
  111   w5     34
  111   w6      0
  111   w7      0

Any help would be great,thanks!

Upvotes: 0

Views: 68

Answers (3)

Ronak Shah
Ronak Shah

Reputation: 388862

Using tidyr::complete :

tidyr::complete(df, cust_id, week = paste0('w', 3:7), fill = list(amount = 0))

#   cust_id week  amount
#   <chr>   <chr>  <dbl>
# 1 111     w3         0
# 2 111     w4        23
# 3 111     w5        34
# 4 111     w6         0
# 5 111     w7         0
# 6 222     w3         0
# 7 222     w4        10
# 8 222     w5        12
# 9 222     w6        35
#10 222     w7         0
#11 333     w3        12
#12 333     w4        23
#13 333     w5        15
#14 333     w6        56
#15 333     w7         0

Upvotes: 2

Sirius
Sirius

Reputation: 5429

here's a way using pivots:


df %>% pivot_wider( names_from=week, values_from=amount ) %>%
    mutate( across(matches("^w\\d"), replace_na, replace=0 ), w7=0 ) %>%
    pivot_longer( cols = matches("^w\\d"), names_to="week", values_to="value" ) %>%
    arrange( cust_id, week )

Output:


   cust_id week  value
   <chr>   <chr> <dbl>
 1 111     w3        0
 2 111     w4       23
 3 111     w5       34
 4 111     w6        0
 5 111     w7        0
 6 222     w3        0
 7 222     w4       10
 8 222     w5       12
 9 222     w6       35
10 222     w7        0
11 333     w3       12
12 333     w4       23
13 333     w5       15
14 333     w6       56
15 333     w7        0

Upvotes: 0

Waldi
Waldi

Reputation: 41220

You could create the missing combinations with expand.grid and then join them :

customers <- unique(df$cust_id)
weeks <- paste0("w",3:7)

weeks <- data.frame(expand.grid(customers,weeks))
colnames(weeks) <- c("cust_id","week")
library(dplyr)
left_join(weeks,df) %>% mutate(across(everything(),replace_na,replace=0))

Joining, by = c("cust_id", "week")
   cust_id week amount
1      111   w3      0
2      222   w3      0
3      333   w3     12
4      111   w4     23
5      222   w4     10
6      333   w4     23
7      111   w5     34
8      222   w5     12
9      333   w5     15
10     111   w6      0
11     222   w6     35
12     333   w6     56
13     111   w7      0
14     222   w7      0
15     333   w7      0

Upvotes: 1

Related Questions