Reputation: 463
I have air pollution data collected in London: number concentrations of a traffic-related pollutant, recorded once per minute. I would like to add a factor column (Peak; Off-Peak) that indicates whether each observation was collected during Peak time (6:30 am-9:30 am) or Off-Peak time (16:00 to 19:00), for Monday to Friday only. The current time column is in POSIXct format (date + time).
QUESTION 1: Should I do the following: 1. remove the Saturday and Sunday data, 2. extract the time from the date+time, 3. identify the time ranges (6:30 am-9:30 am) and (16:00 to 19:00)?
OR is there a way to identify a time range within a day directly from the date+time (POSIXct)?
QUESTION 2: How can I properly extract the time from a date+time variable in POSIXct format so that it can be used to identify the time ranges (6:30 am-9:30 am) and (16:00 to 19:00)?
Upvotes: 0
Views: 65
Reputation: 18708
You can use the chron package, which has a function for identifying weekends (is.weekend) and also one for holidays (is.holiday). The lubridate package can extract the hour and minute from the date/time variable, and dplyr supplies mutate() and case_when().
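library(chron)      # is.weekend()
library(lubridate)  # hour(), minute()
library(dplyr)      # %>%, mutate(), case_when()

data %>%
  mutate(
    # Morning window, 06:30 to 09:30 inclusive, weekdays only
    peak = case_when(
      is.weekend(as.Date(date)) ~ FALSE,
      (hour(date) == 6 & minute(date) >= 30 | hour(date) > 6) &
        (hour(date) == 9 & minute(date) <= 30 | hour(date) < 9) ~ TRUE,
      TRUE ~ FALSE),
    # Evening window; hour(date) <= 19 keeps everything up to 19:59.
    # To stop at 19:00 exactly, use
    # hour(date) < 19 | (hour(date) == 19 & minute(date) == 0) as the upper bound.
    offpeak = case_when(
      is.weekend(as.Date(date)) ~ FALSE,
      (hour(date) >= 16 & hour(date) <= 19) ~ TRUE,
      TRUE ~ FALSE)
  )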
date peak offpeak
1 1998-01-01 06:00:00 FALSE FALSE
2 1998-01-01 06:20:00 FALSE FALSE
3 1998-01-01 06:40:00 TRUE FALSE
4 1998-01-01 07:00:00 TRUE FALSE
5 1998-01-01 07:20:00 TRUE FALSE
6 1998-01-01 07:40:00 TRUE FALSE
7 1998-01-01 08:00:00 TRUE FALSE
8 1998-01-01 08:20:00 TRUE FALSE
9 1998-01-01 08:40:00 TRUE FALSE
10 1998-01-01 09:00:00 TRUE FALSE
11 1998-01-01 09:20:00 TRUE FALSE
12 1998-01-01 09:40:00 FALSE FALSE
13 1998-01-01 10:00:00 FALSE FALSE
30 1998-01-01 15:40:00 FALSE FALSE
31 1998-01-01 16:00:00 FALSE TRUE
32 1998-01-01 16:20:00 FALSE TRUE
33 1998-01-01 16:40:00 FALSE TRUE
34 1998-01-01 17:00:00 FALSE TRUE
35 1998-01-01 17:20:00 FALSE TRUE
36 1998-01-01 17:40:00 FALSE TRUE
37 1998-01-01 18:00:00 FALSE TRUE
38 1998-01-01 18:20:00 FALSE TRUE
39 1998-01-01 18:40:00 FALSE TRUE
40 1998-01-01 19:00:00 FALSE TRUE
41 1998-01-01 19:20:00 FALSE TRUE
42 1998-01-01 19:40:00 FALSE TRUE
43 1998-01-01 20:00:00 FALSE FALSE
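If you would rather have the single factor column (Peak; Off-Peak) asked for in the question, with both windows closed exactly at 09:30 and 19:00, here is a minimal base-R sketch. It is only one possible approach, not the method used above: the helper name label_period is made up for illustration, timestamps outside both windows or on weekends are assumed to become NA, and UTC is assumed for the example timestamps.

```r
# Hypothetical helper (not from any package): label each timestamp as
# "Peak", "Off-Peak", or NA (weekend, or outside both windows).
label_period <- function(x) {
  lt <- as.POSIXlt(x)               # uses the time zone stored on x, if any
  mins <- lt$hour * 60 + lt$min     # minutes since local midnight (seconds ignored)
  weekday <- lt$wday %in% 1:5       # wday: 0 = Sunday, ..., 6 = Saturday
  out <- rep(NA_character_, length(x))
  out[weekday & mins >= 6 * 60 + 30 & mins <= 9 * 60 + 30] <- "Peak"
  out[weekday & mins >= 16 * 60 & mins <= 19 * 60] <- "Off-Peak"
  factor(out, levels = c("Peak", "Off-Peak"))
}

# Assumption: UTC is used here only to keep the example reproducible.
x <- as.POSIXct(c("1998-01-01 06:29:00", "1998-01-01 06:30:00",
                  "1998-01-01 19:00:00", "1998-01-01 19:01:00",
                  "1998-01-03 08:00:00"), tz = "UTC")
data.frame(date = x, period = label_period(x))
```

Working in minutes since midnight avoids the nested hour/minute conditions, and nothing has to be removed from the data first: weekend rows simply get NA.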
Data:
data <- data.frame(date=seq.POSIXt(as.POSIXct("1998-01-01 06:00:00"),
as.POSIXct("1998-01-02 06:00:00"), by="20 min"))
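The sample above only covers a Thursday and a Friday, so the weekend rule is never exercised. A variant that spans a weekend could look like the sketch below. The name data_wk and the choice of UTC are assumptions made for illustration, and the weekday is derived with base R's "%u" format rather than with chron.

```r
# Assumption: UTC is used so the example gives the same result on any machine.
data_wk <- data.frame(
  date = seq(as.POSIXct("1998-01-02 06:00:00", tz = "UTC"),  # a Friday
             as.POSIXct("1998-01-05 06:00:00", tz = "UTC"),  # the following Monday
             by = "20 min")
)

# "%u" gives the ISO weekday as "1" (Monday) to "7" (Sunday), independent of locale.
data_wk$iso_wday <- as.integer(format(data_wk$date, "%u"))
data_wk$weekend <- data_wk$iso_wday >= 6

# Number of rows per weekday, to confirm Saturday and Sunday are present.
table(data_wk$iso_wday)
```

Running the peak/off-peak code on data_wk should leave every Saturday and Sunday row unflagged.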
Upvotes: 0