Reputation: 518
REGISTRATION_TIME
08:53:16
13:18:57
15:57:58
09:25:47
13:35:54
12:01:31
09:37:57
12:44:47
21:26:12
21:26:12
14:56:13
02:09:31
15:28:51
15:30:57
I am trying to plot the time on the x-axis and the count for each time on the y-axis. The above is a sample from a dataset of 5,000 rows. I would also like to create a bin for every hour.
I tried the following:
TIME_plot <- ggplot(LW_Par, aes(REGISTRATION_TIME)) + geom_bar(colour = "white", fill = "#1380A1")
I am having some trouble figuring out how to code this, and any help would be much appreciated.
Time_Plot <- LW_Par %>%
  mutate(REGISTRATION_TIME = hms(REGISTRATION_TIME)) %>%
  ggplot(aes(x = REGISTRATION_TIME)) +
  geom_histogram(bins = 24, colour = "white", fill = "#1380A1") +
  scale_x_time() +
  bbc_style()
Time_Plot
Using the solution provided by H 1 (thank you), how would I go about adding more x-axis breaks to give more insight into where the counts fall?
Also, is there a way to use "summary" on the time data to find the average or mode of the dataset?
Upvotes: 0
Views: 1305
Reputation: 34291
One approach would be to use geom_histogram to easily bin the data:
library(dplyr)
library(ggplot2)
library(lubridate)
dat %>%
  mutate(REGISTRATION_TIME = hms(REGISTRATION_TIME)) %>%
  ggplot(aes(x = REGISTRATION_TIME)) +
  geom_histogram(bins = 24) +
  scale_x_time()
Edit:
You can use the breaks argument in the scale command to set where the x-axis labels fall. You can also achieve finer control of the bins by using the binwidth argument in geom_histogram. As you have a time variable, the unit is seconds, so you can bin by 15 minutes, for example, with `binwidth = 900`.
dat %>%
  mutate(REGISTRATION_TIME = hms(REGISTRATION_TIME)) %>%
  ggplot(aes(x = REGISTRATION_TIME)) +
  geom_histogram(binwidth = 900) +
  scale_x_time(breaks = hm(paste0(seq(0, 24, by = 3), ":00")))
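To make the seconds arithmetic behind binwidth and breaks concrete, here is a minimal sketch using only lubridate. The variable names (times, secs, binwidth_15min, hourly_breaks) are illustrative and not part of the code above. It converts a few of the sample times to seconds since midnight, shows how a bin width in minutes maps to seconds, and builds one break per hour instead of one every three hours. The last line is one possible way to get at the average asked about in the question, assuming that averaging seconds since midnight is what is wanted.

```r
library(lubridate)

# A few of the sample times, as "HH:MM:SS" strings
times <- c("08:53:16", "13:18:57", "21:26:12")

# Seconds since midnight: hours * 3600 + minutes * 60 + seconds
secs <- period_to_seconds(hms(times))

# Bin widths are given in seconds
binwidth_15min <- 15 * 60   # 900, as used above
binwidth_1hour <- 60 * 60   # 3600, one bin per hour

# One break per hour, from 00:00 to 24:00, expressed in seconds
hourly_breaks <- period_to_seconds(hm(paste0(seq(0, 24, by = 1), ":00")))

# Average time of day, converted back to a period for readability
mean_time <- seconds_to_period(round(mean(secs)))
```

Changing by = 3 to by = 1 in the scale_x_time() call above should give hourly labels in the same way, although with 25 labels the axis text may need to be rotated or thinned to stay readable.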
Data:
dat <- read.table(text = "REGISTRATION_TIME
08:53:16
13:18:57
15:57:58
09:25:47
13:35:54
12:01:31
09:37:57
12:44:47
21:26:12
21:26:12
14:56:13
02:09:31
15:28:51
15:30:57", header = TRUE)
Upvotes: 2
Reputation: 66415
Minor variation of @H1's solution. You could do the time bucketing before ggplot:
library(dplyr); library(ggplot2); library(lubridate)
dat %>%
  mutate(REG_TIME_HOUR = hour(hms(REGISTRATION_TIME))) %>%
  count(REG_TIME_HOUR) %>%
  ggplot(aes(REG_TIME_HOUR, n)) +
  geom_col()
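As a rough illustration of what the bucketing step produces before it reaches ggplot, the sketch below runs it on the 14 sample rows and stores the result. The names hourly and busiest are illustrative only. Since the counts are already in a data frame, picking the row with the largest n is one simple way to get the modal hour that the question asks about.

```r
library(dplyr)
library(lubridate)

# Sample times from the question
dat <- data.frame(REGISTRATION_TIME = c(
  "08:53:16", "13:18:57", "15:57:58", "09:25:47", "13:35:54",
  "12:01:31", "09:37:57", "12:44:47", "21:26:12", "21:26:12",
  "14:56:13", "02:09:31", "15:28:51", "15:30:57"
))

# One row per hour that appears in the data, with its count in column n
hourly <- dat %>%
  mutate(REG_TIME_HOUR = hour(hms(REGISTRATION_TIME))) %>%
  count(REG_TIME_HOUR)

# Hour(s) with the most registrations, i.e. the modal hour
busiest <- hourly %>%
  filter(n == max(n))
```

On the sample data this gives 8 distinct hours, with hour 15 the busiest at 3 registrations. Note that count() only returns hours that occur in the data, so hours with no registrations are simply absent from hourly rather than listed with a count of zero.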
Upvotes: 0