Reputation: 143
I have a datetime column that I'd like to filter down to only contain 15 minute intervals with the first row's datetime being the baseline. Observe:
order_dates
1 2022-08-14 12:15:10
2 2022-08-14 12:15:11
3 2022-08-14 12:15:13
4 2022-08-14 12:20:10
5 2022-08-14 12:20:16
6 2022-08-14 12:20:14
7 2022-08-14 12:25:19
8 2022-08-14 12:25:12
9 2022-08-14 12:25:20
10 2022-08-14 12:30:23
11 2022-08-14 12:30:31
12 2022-08-14 12:30:34
13 2022-08-14 12:40:32
14 2022-08-14 12:40:52
15 2022-08-14 12:40:51
16 2022-08-14 12:45:40
17 2022-08-14 12:45:45
18 2022-08-14 12:45:23
I would like the final dataset to only contain rows that contain 15 minute time-intervals with 12:15 as the baseline. So, I would like my final dataset to only contain rows 1,2,3,10,11,12,16,17,18. I would also like the final times to have the seconds column removed as well. Also, the order_dates column is of the POSIXct class
Upvotes: 1
Views: 382
Reputation: 173793
To allow this to work for any initial time, and to produce the formatting you desire, you could do:
library(lubridate)
library(tidyverse)
df %>%
mutate(order_dates = floor_date(order_dates, unit = 'min'),
steps = as.numeric(order_dates - order_dates[1])/60) %>%
filter(steps %% 15 == 0) %>%
select(-steps) %>%
mutate(order_dates = substr(order_dates, 1, 16))
#> order_dates
#> 1 2022-08-14 12:15
#> 2 2022-08-14 12:15
#> 3 2022-08-14 12:15
#> 4 2022-08-14 12:30
#> 5 2022-08-14 12:30
#> 6 2022-08-14 12:30
#> 7 2022-08-14 12:45
#> 8 2022-08-14 12:45
#> 9 2022-08-14 12:45
Upvotes: 2
Reputation: 7385
Using dplyr
and lubridate
:
library(dplyr)
library(lubridate)
df %>%
mutate(order_dates = format(order_dates ,format = "%Y-%m-%d %H:%M"),
mins = lubridate::minute(order_dates)) %>%
filter(mins %in% c(15, 30, 45)) %>%
select(-mins)
This gives:
# A tibble: 9 × 1
order_dates
<chr>
1 2022-08-14 12:15
2 2022-08-14 12:15
3 2022-08-14 12:15
4 2022-08-14 12:30
5 2022-08-14 12:30
6 2022-08-14 12:30
7 2022-08-14 12:45
8 2022-08-14 12:45
9 2022-08-14 12:45
Data:
structure(list(order_dates = structure(c(1660479310, 1660479311,
1660479313, 1660479610, 1660479616, 1660479614, 1660479919, 1660479912,
1660479920, 1660480223, 1660480231, 1660480234, 1660480832, 1660480852,
1660480851, 1660481140, 1660481145, 1660481123), class = c("POSIXct",
"POSIXt"), tzone = "UTC")), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -18L))
Upvotes: 2
Reputation: 886938
Perhaps we extract the minutes part and create a logical expression
library(dplyr)
library(lubridate)
df1 %>%
mutate(order_dates = ymd_hms(order_dates)) %>%
filter(! minute(order_dates)%%15 )
-output
order_dates
1 2022-08-14 12:15:10
2 2022-08-14 12:15:11
3 2022-08-14 12:15:13
10 2022-08-14 12:30:23
11 2022-08-14 12:30:31
12 2022-08-14 12:30:34
16 2022-08-14 12:45:40
17 2022-08-14 12:45:45
18 2022-08-14 12:45:23
df1 <- structure(list(order_dates = c("2022-08-14 12:15:10", "2022-08-14 12:15:11",
"2022-08-14 12:15:13", "2022-08-14 12:20:10", "2022-08-14 12:20:16",
"2022-08-14 12:20:14", "2022-08-14 12:25:19", "2022-08-14 12:25:12",
"2022-08-14 12:25:20", "2022-08-14 12:30:23", "2022-08-14 12:30:31",
"2022-08-14 12:30:34", "2022-08-14 12:40:32", "2022-08-14 12:40:52",
"2022-08-14 12:40:51", "2022-08-14 12:45:40", "2022-08-14 12:45:45",
"2022-08-14 12:45:23")), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18"))
Upvotes: 3