Reputation: 409
Here is my data
## Data
datex <- c(rep("2021-01-18", 61), rep("2021-01-19", 125))
hourx <- c(0,1,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13,14,14,15,16,10,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13,14,14,15,11,0,0,0,0,0,0,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,3,3,3,3,3,3,3,4,4,4,4,4,4,4,5,5,5,5,5,5,5,6,6,6,6,6,6,6,6,7,7,7,7,7,7,7,7,8,8,8,8,8,8,8,8,9,9,9,9,9,9,9,9,10,10,10,10,10,10,10,10,11,11,11,11,11,11,11,11,11,11,12,12,12,12,12,12,12,12,13,13,13,13,13,13,13,13,14,14,14,14,14,14,14,14,14,15,15,15,15,16,16,16,16)
transaction <- c(1,6,2,5,1,2,1,9,6,12,5,25,14,6,22,9,10,14,15,12,22,12,12,14,9,11,3,3,4,0,1,4,3,1,2,3,3,5,7,5,5,6,9,16,8,13,10,20,15,18,10,19,15,5,13,12,10,12,26,14,0,4,0,0,0,2,0,0,2,0,4,0,6,8,0,2,3,0,2,0,1,0,1,0,2,0,0,2,1,1,0,0,3,0,1,0,3,0,0,6,5,2,0,8,0,0,12,11,0,2,0,11,0,0,14,21,0,0,13,7,0,17,0,0,18,0,7,0,4,4,0,0,7,12,0,13,0,0,13,6,9,0,0,0,16,0,0,16,0,14,0,0,9,0,11,8,0,8,0,0,8,0,10,5,0,15,0,0,3,0,0,8,8,0,0,6,5,0,8,0,0,5,1,0,0,3)
mydata <- data.frame(datex, hourx, seller, product, detail, status, channel, transaction)
My task is to add 0 into combination. This is what i mean. It's a sample, i wanna find changepoint from
From the result, we can see datex "2021-01-18" and "2021-01-19" miss hourx from 17 to 23, So we need to add 0 into hourx 17-23. i Do it manually like this
How do i add 0 into missing "hourx" automatically using dplyr for all combination? Many Thank You.
Upvotes: 0
Views: 100
Reputation: 26238
Do this.
library(dplyr)
library(tidyr)
mydata %>% group_by(datex, seller, product, detail, status, channel) %>%
complete(hourx = seq(0, 23, 1)) %>%
mutate(transaction = ifelse(is.na(transaction), 0, transaction))
# A tibble: 408 x 8
# Groups: datex, seller, product, detail, status, channel [17]
datex seller product detail status channel hourx transaction
<chr> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl>
1 2021-01-18 dombsdpapp1 00021459 E99 FI04 f3 0 0
2 2021-01-18 dombsdpapp1 00021459 E99 FI04 f3 1 0
3 2021-01-18 dombsdpapp1 00021459 E99 FI04 f3 2 0
4 2021-01-18 dombsdpapp1 00021459 E99 FI04 f3 3 5
5 2021-01-18 dombsdpapp1 00021459 E99 FI04 f3 4 2
6 2021-01-18 dombsdpapp1 00021459 E99 FI04 f3 5 9
7 2021-01-18 dombsdpapp1 00021459 E99 FI04 f3 6 12
8 2021-01-18 dombsdpapp1 00021459 E99 FI04 f3 7 25
9 2021-01-18 dombsdpapp1 00021459 E99 FI04 f3 8 22
10 2021-01-18 dombsdpapp1 00021459 E99 FI04 f3 9 10
# ... with 398 more rows
The same operation on cp_data will have exactly desired results
cp_data %>% group_by(datex, seller, product, detail, status, channel) %>%
complete(hourx = seq(0, 23, 1)) %>%
mutate(transaction = ifelse(is.na(transaction), 0, transaction))
# A tibble: 48 x 8
# Groups: datex, seller, product, detail, status, channel [2]
datex seller product detail status channel hourx transaction
<chr> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl>
1 2021-01-18 dombsdpapp1 00021460 E99 FI04 f2 0 1
2 2021-01-18 dombsdpapp1 00021460 E99 FI04 f2 1 6
3 2021-01-18 dombsdpapp1 00021460 E99 FI04 f2 2 2
4 2021-01-18 dombsdpapp1 00021460 E99 FI04 f2 3 1
5 2021-01-18 dombsdpapp1 00021460 E99 FI04 f2 4 1
6 2021-01-18 dombsdpapp1 00021460 E99 FI04 f2 5 6
7 2021-01-18 dombsdpapp1 00021460 E99 FI04 f2 6 5
8 2021-01-18 dombsdpapp1 00021460 E99 FI04 f2 7 14
9 2021-01-18 dombsdpapp1 00021460 E99 FI04 f2 8 6
10 2021-01-18 dombsdpapp1 00021460 E99 FI04 f2 9 9
# ... with 38 more rows
Upvotes: 1