Koolakuf_DR

Reputation: 518

Plotting Time and Getting the Count in R

REGISTRATION_TIME
08:53:16
13:18:57
15:57:58
09:25:47
13:35:54
12:01:31
09:37:57
12:44:47
21:26:12
21:26:12
14:56:13
02:09:31
15:28:51
15:30:57

I am trying to plot the time on the x-axis and find the count for each time. This is a sample from a dataset of 5,000 rows. I would also like to create a bin for every hour.

I tried the following:

TIME_plot <- ggplot(LW_Par, aes(REGISTRATION_TIME)) + geom_bar(colour = "white", fill = "#1380A1")

I am having some trouble figuring out how to code this, and any help would be much appreciated.


Time_Plot <- LW_Par %>%
  mutate(REGISTRATION_TIME = hms(REGISTRATION_TIME)) %>% 
  ggplot(aes(x = REGISTRATION_TIME)) +
  geom_histogram(bins = 24, colour = "white", fill = "#1380A1") + 
  scale_x_time() + bbc_style()
Time_Plot

Using the solution provided by H 1 (thank you), how would I add more x-axis breaks to show more clearly where the counts fall?

Also, is there a way to use "summary" on the time data to find the average or mode of the dataset?

Upvotes: 0

Views: 1305

Answers (2)

lroha

Reputation: 34291

One approach would be to use geom_histogram to easily bin the data:

library(dplyr)
library(ggplot2)
library(lubridate)

dat %>%
  mutate(REGISTRATION_TIME = hms(REGISTRATION_TIME)) %>% 
  ggplot(aes(x = REGISTRATION_TIME)) +
  geom_histogram(bins = 24) +
  scale_x_time()

Edit:

You can use the `breaks` argument of the scale function to control where the x-axis labels fall. For finer control over the bins, use the `binwidth` argument of `geom_histogram`. Because the variable is a time, the unit is seconds, so `binwidth = 900` bins the data in 15-minute intervals, for example.

dat %>%
  mutate(REGISTRATION_TIME = hms(REGISTRATION_TIME)) %>% 
  ggplot(aes(x = REGISTRATION_TIME)) +
  geom_histogram(binwidth = 900) +
  scale_x_time(breaks = hm(paste0(seq(0, 24, by = 3), ":00")))
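
For the hourly bins asked about in the question, here is a minimal sketch along the same lines. `binwidth = 3600` gives one-hour bins, and `boundary = 0` (a `geom_histogram` argument not used above) aligns the bin edges to whole hours, which `bins = 24` on its own generally will not do because the edges then depend on the range of the data. The name `hourly_plot` is only illustrative, and the data frame repeats the sample from the question.

```r
library(dplyr)
library(ggplot2)
library(lubridate)

# Sample times from the question
dat <- data.frame(
  REGISTRATION_TIME = c(
    "08:53:16", "13:18:57", "15:57:58", "09:25:47", "13:35:54",
    "12:01:31", "09:37:57", "12:44:47", "21:26:12", "21:26:12",
    "14:56:13", "02:09:31", "15:28:51", "15:30:57"
  ),
  stringsAsFactors = FALSE
)

hourly_plot <- dat %>%
  mutate(REGISTRATION_TIME = hms(REGISTRATION_TIME)) %>%
  ggplot(aes(x = REGISTRATION_TIME)) +
  # 3600 seconds = 1 hour; boundary = 0 places the bin edges on the hour
  geom_histogram(binwidth = 3600, boundary = 0, colour = "white") +
  # one axis label every 3 hours, as in the example above
  scale_x_time(breaks = hm(paste0(seq(0, 24, by = 3), ":00")))

hourly_plot
```

Each bar then covers exactly one clock hour, so the heights can be read directly as registrations per hour.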

Data:

dat <- read.table(text = "REGISTRATION_TIME
08:53:16
13:18:57
15:57:58
09:25:47
13:35:54
12:01:31
09:37:57
12:44:47
21:26:12
21:26:12
14:56:13
02:09:31
15:28:51
15:30:57", header = TRUE, stringsAsFactors = FALSE)

Upvotes: 2

Jon Spring

Reputation: 66415

Minor variation of @H1's solution. You could do the time bucketing before ggplot:

library(dplyr); library(lubridate); library(ggplot2)
dat %>%
  mutate(REG_TIME_HOUR = hour(hms(REGISTRATION_TIME))) %>%
  count(REG_TIME_HOUR) %>%
  ggplot(aes(REG_TIME_HOUR, n)) + geom_col()
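
On the follow-up about the average and mode: one option is to work in seconds since midnight and convert back afterwards. This is a minimal sketch on the sample data from the question, taking "mode" to mean the busiest hour; `secs`, `mean_time` and `modal_hour` are illustrative names. Note that a plain mean ignores the wrap-around at midnight, which may matter if registrations cluster late at night.

```r
library(lubridate)

# Sample times from the question
times <- c(
  "08:53:16", "13:18:57", "15:57:58", "09:25:47", "13:35:54",
  "12:01:31", "09:37:57", "12:44:47", "21:26:12", "21:26:12",
  "14:56:13", "02:09:31", "15:28:51", "15:30:57"
)

periods <- hms(times)

# Seconds since midnight for each registration
secs <- period_to_seconds(periods)

# Average time of day, converted back to a Period for readability
mean_time <- seconds_to_period(round(mean(secs)))

# Busiest hour: tabulate the hour component and take the most frequent one
hour_counts <- table(hour(periods))
modal_hour <- as.integer(names(hour_counts)[which.max(hour_counts)])

mean_time
modal_hour
```

Calling `summary(secs)` on the numeric seconds also gives the usual quartiles, which can be passed through `seconds_to_period()` in the same way.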

Upvotes: 0
