Reputation: 399
I have a dataframe that I am trying to filter on with regards to two columns, a date and a time.
I currently have another function (get_datetime) that takes in a date with format 'YYYYMMDD' and a time 'HHMM' and returns a POSIXct object.
My filter condition is on if the datetime is within a certain number of hours (hour_diff), and this is what I currently have:
rows <- rows[abs(as.numeric(difftime(datetime, get_datetime(rows$file_date, rows$file_time), units='hours'))) <= hour_diff,]
get_datetime <- function(date, time) {
#format date and time into POSIXct object
year <- substr(landfall_date, 1, 4)
month <- substr(landfall_date, 5, 6)
day <- substr(landfall_date, 7, 8)
hour <- substr(landfall_time, 1, nchar(landfall_time) - 2)
minute <- substr(landfall_time, nchar(landfall_time) - 1, nchar(landfall_time))
datetime <- as.POSIXct(paste0(year, '-', month, '-', day, ' ', hour, ':', minute, ':00'))
return(datetime)
}
How do I pass get_datetime an individual date and time as opposed to the entire date and time columns, or is there an alternative way for me to properly filter the rows?
Here is some sample data:
structure(list(county = structure(1:6, .Label = c("beaufort", "bertie", "brunswick", "camden", "carteret", "chowan"), class = "factor"), file_date = c(19900724L, 19900724L, 19900725L, 19900725L, 19900726L, 19900726L), file_time = c(300L, 1200L, 1800L, 1800L, 1200L, 1800L)), class = "data.frame", row.names = c(NA, -6L))
datetime <- as.POSIXct('1990-07-25 12:00')
hour_diff <- 12
With the provided datetime above and 12 hours as the hour_diff, I would want to receive the middle 4 rows (bertie, brunswick, camden, carteret).
Upvotes: 1
Views: 54
Reputation: 750
I recommend the package stringr
and anytime
to clean up your dates and times.
library(anytime)
library(stringr)
library(dplyr)
library(lubridate)
#pad your times that are less than 4 digits
df$file_time = str_pad(df$file_time,width=4,side = "left", pad= "0")
#convert your date and time to datetime
df$new_dt = anytime(paste(df$file_date, df$file_time))
#create an hour flag
df$hour = hour(df$new_dt)
#filter to get your result:
df %>% filter( hour == '12')
county file_date file_time new_dt hour
1 bertie 19900724 1200 1990-07-24 12:00:00 12
2 carteret 19900726 1200 1990-07-26 12:00:00 12
OR if you wanted the range of datetimes between 1990-07-24 12:00:00
and 1990-07-26 12:00:00
df %>% filter(new_dt >= '1990-07-24 12:00:00' & new_dt <= '1990-07-26 12:00:00')
county file_date file_time new_dt hour
1 bertie 19900724 1200 1990-07-24 12:00:00 12
2 brunswick 19900725 1800 1990-07-25 18:00:00 18
3 camden 19900725 1800 1990-07-25 18:00:00 18
4 carteret 19900726 1200 1990-07-26 12:00:00 12
Upvotes: 1