Reputation: 450
I have a data frame with a datetime column. I want to know the number of rows by hour of the day. However, I care only about the rows between 8 AM and 10 PM.
The lubridate package requires us to filter hours of the day using the 24-hour convention.
library(tidyverse)
library(lubridate)
### Fake Data with Date-time ----
x <- seq.POSIXt(as.POSIXct('1999-01-01'), as.POSIXct('1999-02-01'), length.out=1000)
df <- data.frame(myDateTime = x)
### Get all rows between 8 AM and 10 PM (inclusive)
df %>%
  mutate(myHour = hour(myDateTime)) %>%
  filter(myHour >= 8, myHour <= 22) %>% ## between 8 AM and 10 PM (both inclusive)
  count(myHour) ## number of rows
Is there a way for me to use 10:00 PM rather than the integer 22?
Upvotes: 1
Views: 3783
Reputation: 388982
You can also use base R to do this:
# Extract the hour
df$hour_day <- as.numeric(format(df$myDateTime, "%H"))
# Subset data between 08:00 AM and 10:00 PM
new_df <- df[df$hour_day >= as.integer(format(as.POSIXct("08:00 AM", format = "%I:%M %p"), "%H")) &
             df$hour_day <= as.integer(format(as.POSIXct("10:00 PM", format = "%I:%M %p"), "%H")), ]
# Count the frequency
stack(table(new_df$hour_day))
# values ind
#1 42 8
#2 42 9
#3 41 10
#4 42 11
#5 42 12
#6 41 13
#7 42 14
#8 41 15
#9 42 16
#10 42 17
#11 41 18
#12 42 19
#13 42 20
#14 41 21
#15 42 22
This gives the same output as the tidyverse/lubridate approach:
library(tidyverse)
library(lubridate)
df %>%
  mutate(myHour = hour(myDateTime)) %>%
  filter(myHour >= hour(ymd_hm("2000-01-01 8:00 AM")),
         myHour <= hour(ymd_hm("2000-01-01 10:00 PM"))) %>%
  count(myHour)
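The 12-hour to 24-hour conversion in the base R subset earlier in this answer can be pulled into a small helper so the filter reads more cleanly. The following is only a sketch: the name hour_12_to_24 is hypothetical (it is not from the original answers), and it assumes an English-style locale, since %p parsing of AM/PM depends on the locale.

```r
# Hypothetical helper (not part of the original answers): converts 12-hour
# clock strings such as "10:00 PM" to integer hours on the 24-hour clock.
hour_12_to_24 <- function(time_12h) {
  # strptime() fills in today's date; only the hour is kept afterwards.
  parsed <- strptime(time_12h, format = "%I:%M %p", tz = "UTC")
  bad <- is.na(parsed)
  if (any(bad)) {
    stop("Could not parse: ", paste(time_12h[bad], collapse = ", "))
  }
  as.integer(format(parsed, "%H"))
}

# Same fake data as in the question
x <- seq.POSIXt(as.POSIXct("1999-01-01"), as.POSIXct("1999-02-01"), length.out = 1000)
df <- data.frame(myDateTime = x)
df$hour_day <- as.integer(format(df$myDateTime, "%H"))

# Keep rows whose hour lies between 8 AM and 10 PM (both inclusive)
keep <- df$hour_day >= hour_12_to_24("8:00 AM") &
  df$hour_day <= hour_12_to_24("10:00 PM")
hour_counts <- table(df$hour_day[keep])
```

Stopping on unparseable input is a design choice made here; without it, an NA bound would silently turn every comparison into NA.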
Upvotes: 2
Reputation: 450
You can use the ymd_hm and hour functions to do 12-hour to 24-hour conversions.
df %>%
  mutate(myHour = hour(myDateTime)) %>%
  filter(myHour >= hour(ymd_hm("2000-01-01 8:00 AM")), ## hour() ignores year, month, date
         myHour <= hour(ymd_hm("2000-01-01 10:00 PM"))) %>% ## between 8 AM and 10 PM (both inclusive)
  count(myHour)
A more elegant solution is to wrap the conversion in a small helper function:
## custom function to convert 12 hour time to 24 hour time
hourOfDay_12to24 <- function(time12hrFmt){
  out <- paste("2000-01-01", time12hrFmt)
  out <- hour(ymd_hm(out))
  out
}
df %>%
  mutate(myHour = hour(myDateTime)) %>%
  filter(myHour >= hourOfDay_12to24("8:00 AM"),
         myHour <= hourOfDay_12to24("10:00 PM")) %>% ## between 8 AM and 10 PM (both inclusive)
  count(myHour)
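One thing to keep in mind with any hour-based filter like the ones above: myHour <= 22 keeps every row up to 10:59 PM, because the comparison is on the hour bucket. That is what you want when counting by hour, but if the cutoff should be exactly 10:00 PM, one possible approach is to compare on seconds since midnight. A rough base R sketch, where the variable names are made up for illustration and the data is pinned to UTC as an assumption:

```r
# Same fake data as in the question, pinned to UTC for reproducibility
x <- seq.POSIXt(as.POSIXct("1999-01-01", tz = "UTC"),
                as.POSIXct("1999-02-01", tz = "UTC"), length.out = 1000)
df <- data.frame(myDateTime = x)

# Seconds elapsed since midnight for each row (fractional seconds dropped)
hh <- as.integer(format(df$myDateTime, "%H"))
mm <- as.integer(format(df$myDateTime, "%M"))
ss <- as.integer(format(df$myDateTime, "%S"))
secs_of_day <- hh * 3600L + mm * 60L + ss

# Keep 08:00:00 through 22:00:00; rows later in hour 22 are dropped
keep_exact <- secs_of_day >= 8L * 3600L & secs_of_day <= 22L * 3600L

# For comparison: the hour-bucket filter used in the answers above
keep_hour <- hh >= 8L & hh <= 22L
```

Every row kept by keep_exact is also kept by keep_hour; the two differ only for rows that fall after 10:00:00 PM within hour 22.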
Upvotes: 3