Reputation: 147
I am trying to solve the following problem. I have a data frame df with a date column, and I want to count the occurrences per hour. Here is what I've done:
library(dplyr)
library(lubridate)

x <- df %>%
  mutate(hora = hour(date)) %>%
  select(hora) %>%
  count(hora)
That gives the following result:
> x
# A tibble: 19 x 2
hora n
<int> <int>
1 0 1
2 1 1
3 3 1
4 8 4
5 9 7
6 10 10
7 11 14
8 12 10
9 13 8
10 14 4
11 15 5
12 16 12
13 17 4
14 18 12
15 19 9
16 20 5
17 21 2
18 22 4
19 23 4
As you can see, some hours that should have n=0, such as 2 or 4:7, don't show up. I want to add the hours that are missing from x with n=0 so that the table is complete.
The expected output should be something like this:
hora n
1 0 12
2 1 3
3 2 5
4 3 7
5 4 8
6 5 1
7 6 0
8 7 11
9 8 6
10 9 10
11 10 9
12 11 0
13 12 0
14 13 3
15 14 0
16 15 7
17 16 8
18 17 1
19 18 2
20 19 11
21 20 6
22 21 10
23 22 9
24 23 4
I tried creating a table with hours 0:23 and all n=0 and then summing the two tables, but obviously that didn't work. I also tried x$hour <- 0:23, thinking that the missing values would be added, but that didn't work either.
Upvotes: 0
Views: 1141
Reputation: 10845
A solution in base R merges a vector of hours with the summarized data and sets the missing counts to 0.
textFile <- "row hour count
1 0 1
2 1 1
3 3 1
4 8 4
5 9 7
6 10 10
7 11 14
8 12 10
9 13 8
10 14 4
11 15 5
12 16 12
13 17 4
14 18 12
15 19 9
16 20 5
17 21 2
18 22 4
19 23 4"
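# Read the summarized counts and drop the row-number column.
data <- read.table(text = textFile, header = TRUE)[-1]
# Build the full set of hours, then keep every hour via a right outer join.
hours <- data.frame(hour = 0:23)
merged <- merge(data, hours, all.y = TRUE)
# Hours missing from the data come back as NA; set those counts to 0.
merged[is.na(merged$count), "count"] <- 0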
...and the output:
> head(merged)
hour count
1 0 1
2 1 1
3 2 0
4 3 1
5 4 0
6 5 0
>
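If the raw hours are still available (rather than only the summarized counts), the merge step can probably be skipped. The following is a minimal sketch, assuming a hypothetical vector horas of observed hours that stands in for hour(df$date); it is not the asker's data. It relies on table() reporting a zero for every factor level that never occurs.

```r
# Hypothetical sample of observed hours (an assumption, not the asker's data).
horas <- c(0, 1, 3, 8, 8, 9, 23)

# Treat the hour as a factor with all 24 levels so that table()
# includes a zero count for every hour that never occurs.
counts <- table(factor(horas, levels = 0:23))

# Convert to a data frame with the column names used in the question.
result <- data.frame(hora = 0:23, n = as.vector(counts))
```

The result has one row per hour from 0 to 23, in order, so no NA replacement is needed afterwards.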
Upvotes: 2
Reputation: 388797
You could convert hora to factor and use .drop = FALSE in count
library(dplyr)
library(lubridate)
df %>%
  mutate(hora = factor(hour(date), levels = 0:23)) %>%
  count(hora, .drop = FALSE)
Another option is to use complete:
df %>%
  mutate(hora = hour(date)) %>%
  count(hora) %>%
  tidyr::complete(hora = 0:23, fill = list(n = 0))
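To try the complete approach without the original data, here is a small self-contained sketch. The tibble df below is made up for illustration (four timestamps covering only hours 0, 8 and 23), and the fill value is written as 0L on the assumption that you want n to stay an integer column.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Hypothetical data: four timestamps covering only hours 0, 8 and 23.
df <- tibble(date = ymd_hms(c(
  "2020-01-01 00:15:00",
  "2020-01-01 08:05:00",
  "2020-01-02 08:45:00",
  "2020-01-02 23:59:00"
)))

x <- df %>%
  mutate(hora = hour(date)) %>%          # extract the hour (0-23)
  count(hora) %>%                        # count rows per observed hour
  complete(hora = 0:23, fill = list(n = 0L))  # add the missing hours with n = 0
```

With this input, x should have 24 rows, with n equal to 2 for hour 8 and 0 for every hour that has no timestamps.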
Upvotes: 4