Faryan
Faryan

Reputation: 409

Adding value 0 into data using dplyr in R

Here is my data

## Data
datex <- c(rep("2021-01-18", 61), rep("2021-01-19", 125))
hourx <- c(0,1,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13,14,14,15,16,10,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13,14,14,15,11,0,0,0,0,0,0,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,3,3,3,3,3,3,3,4,4,4,4,4,4,4,5,5,5,5,5,5,5,6,6,6,6,6,6,6,6,7,7,7,7,7,7,7,7,8,8,8,8,8,8,8,8,9,9,9,9,9,9,9,9,10,10,10,10,10,10,10,10,11,11,11,11,11,11,11,11,11,11,12,12,12,12,12,12,12,12,13,13,13,13,13,13,13,13,14,14,14,14,14,14,14,14,14,15,15,15,15,16,16,16,16)
transaction <- c(1,6,2,5,1,2,1,9,6,12,5,25,14,6,22,9,10,14,15,12,22,12,12,14,9,11,3,3,4,0,1,4,3,1,2,3,3,5,7,5,5,6,9,16,8,13,10,20,15,18,10,19,15,5,13,12,10,12,26,14,0,4,0,0,0,2,0,0,2,0,4,0,6,8,0,2,3,0,2,0,1,0,1,0,2,0,0,2,1,1,0,0,3,0,1,0,3,0,0,6,5,2,0,8,0,0,12,11,0,2,0,11,0,0,14,21,0,0,13,7,0,17,0,0,18,0,7,0,4,4,0,0,7,12,0,13,0,0,13,6,9,0,0,0,16,0,0,16,0,14,0,0,9,0,11,8,0,8,0,0,8,0,10,5,0,15,0,0,3,0,0,8,8,0,0,6,5,0,8,0,0,5,1,0,0,3)
mydata <- data.frame(datex, hourx, seller, product, detail, status, channel, transaction)

My task is to add 0 into combination. This is what i mean. It's a sample, i wanna find changepoint from

From the result, we can see datex "2021-01-18" and "2021-01-19" miss hourx from 17 to 23, So we need to add 0 into hourx 17-23. i Do it manually like this

How do i add 0 into missing "hourx" automatically using dplyr for all combination? Many Thank You.

Upvotes: 0

Views: 100

Answers (1)

AnilGoyal
AnilGoyal

Reputation: 26238

Do this.

library(dplyr)
library(tidyr)

mydata %>% group_by(datex, seller, product, detail, status, channel) %>%
  complete(hourx = seq(0, 23, 1)) %>%
  mutate(transaction = ifelse(is.na(transaction), 0, transaction))

# A tibble: 408 x 8
# Groups:   datex, seller, product, detail, status, channel [17]
   datex      seller      product  detail status channel hourx transaction
   <chr>      <chr>       <chr>    <chr>  <chr>  <chr>   <dbl>       <dbl>
 1 2021-01-18 dombsdpapp1 00021459 E99    FI04   f3          0           0
 2 2021-01-18 dombsdpapp1 00021459 E99    FI04   f3          1           0
 3 2021-01-18 dombsdpapp1 00021459 E99    FI04   f3          2           0
 4 2021-01-18 dombsdpapp1 00021459 E99    FI04   f3          3           5
 5 2021-01-18 dombsdpapp1 00021459 E99    FI04   f3          4           2
 6 2021-01-18 dombsdpapp1 00021459 E99    FI04   f3          5           9
 7 2021-01-18 dombsdpapp1 00021459 E99    FI04   f3          6          12
 8 2021-01-18 dombsdpapp1 00021459 E99    FI04   f3          7          25
 9 2021-01-18 dombsdpapp1 00021459 E99    FI04   f3          8          22
10 2021-01-18 dombsdpapp1 00021459 E99    FI04   f3          9          10
# ... with 398 more rows

The same operation on cp_data will have exactly desired results

cp_data %>% group_by(datex, seller, product, detail, status, channel) %>%
  complete(hourx = seq(0, 23, 1)) %>%
  mutate(transaction = ifelse(is.na(transaction), 0, transaction))

# A tibble: 48 x 8
# Groups:   datex, seller, product, detail, status, channel [2]
   datex      seller      product  detail status channel hourx transaction
   <chr>      <chr>       <chr>    <chr>  <chr>  <chr>   <dbl>       <dbl>
 1 2021-01-18 dombsdpapp1 00021460 E99    FI04   f2          0           1
 2 2021-01-18 dombsdpapp1 00021460 E99    FI04   f2          1           6
 3 2021-01-18 dombsdpapp1 00021460 E99    FI04   f2          2           2
 4 2021-01-18 dombsdpapp1 00021460 E99    FI04   f2          3           1
 5 2021-01-18 dombsdpapp1 00021460 E99    FI04   f2          4           1
 6 2021-01-18 dombsdpapp1 00021460 E99    FI04   f2          5           6
 7 2021-01-18 dombsdpapp1 00021460 E99    FI04   f2          6           5
 8 2021-01-18 dombsdpapp1 00021460 E99    FI04   f2          7          14
 9 2021-01-18 dombsdpapp1 00021460 E99    FI04   f2          8           6
10 2021-01-18 dombsdpapp1 00021460 E99    FI04   f2          9           9
# ... with 38 more rows

Upvotes: 1

Related Questions