R: How to filter the earliest (minimum-minute) row in every hour?

Question

I'm still new to R. In SQL Server Management Studio I can select the first row of every hour using a partition. My big problem is that running this query with the sqlQuery() function in R takes too long, and it slows down my Shiny web app even more. The one approach I have tested that takes less time is to query all the data and prepare it in R, which is much faster.

Here is my example data:

                    Date Val
1    2017-06-11 04:00:02  83
2    2017-06-11 04:01:02  77
3    2017-06-11 04:03:01  78
..
4    2017-05-11 05:00:03  83
5    2017-05-11 05:01:02  95
6    2017-05-11 05:02:02  10
..
7    2017-05-11 06:01:00  29
8    2017-05-11 06:02:02  39
9    2017-05-11 06:03:03  85
10   2017-05-11 06:04:02  71
..
11   2017-05-11 07:05:02  33
12   2017-05-11 07:06:02  14
.. (many rows)
13   2017-05-12 00:01:01  20
14   2017-05-12 00:03:01  45

Sometimes rows are missing within an hour. My problem is how to keep only the row with the earliest time in every hour.

My expected output:

                    Date Val
1    2017-05-11 04:00:02  83
4    2017-05-11 05:00:03  83
7    2017-05-11 06:01:00  29
11   2017-05-11 07:05:02  33
13   2017-05-12 00:01:01  20

Any ideas would be appreciated. Thank you so much.

neilfws · Accepted Answer

You can use lubridate and dplyr to convert the dates, extract the date and hour, then filter for the minimum time within an hour by date. I changed your June 31 to June 30 :)

library(dplyr)
library(lubridate)

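# `dataset` is the question's data frame with columns Date and Val
dataset %>% 
  # parse the timestamps, then derive the calendar date and the hour of day
  mutate(Date = ymd_hms(Date), dt = as_date(Date), hr = hour(Date)) %>% 
  # one group per date-hour combination
  group_by(dt, hr) %>% 
  # keep the row(s) with the earliest timestamp in each group
  filter(Date == min(Date)) %>% 
  ungroup() %>% 
  select(Date, Val)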

                 Date   Val
1 2017-06-30 04:00:02    83
2 2017-05-11 05:00:03    83
3 2017-05-11 06:01:00    29
4 2017-05-11 07:05:02    33
5 2017-05-12 00:01:01    20
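
For readers who want to try the idea without installing any packages, here is a minimal base R sketch of the same grouping logic. It is an alternative, not the code from the answer above: the sample rows are a subset copied from the question, the `UTC` time zone is an assumption, and the names `hour_key`, `ord` and `first_per_hour` are made up for illustration.

```r
# Sample rows copied from the question (a subset; character timestamps assumed).
dataset <- data.frame(
  Date = c("2017-05-11 05:00:03", "2017-05-11 05:01:02", "2017-05-11 05:02:02",
           "2017-05-11 06:01:00", "2017-05-11 06:02:02", "2017-05-11 07:05:02",
           "2017-05-11 07:06:02", "2017-05-12 00:01:01", "2017-05-12 00:03:01"),
  Val  = c(83, 95, 10, 29, 39, 33, 14, 20, 45),
  stringsAsFactors = FALSE
)

# Parse the timestamps; UTC is an assumption, chosen to avoid daylight-saving gaps.
dataset$Date <- as.POSIXct(dataset$Date, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Group key: calendar date plus hour of day, e.g. "2017-05-11 05".
hour_key <- format(dataset$Date, "%Y-%m-%d %H")

# Sort by time, then keep the first row seen for each date-hour key.
ord <- order(dataset$Date)
first_per_hour <- dataset[ord, ][!duplicated(hour_key[ord]), ]
rownames(first_per_hour) <- NULL

print(first_per_hour)
```

One difference worth noting: if two rows share the exact same earliest timestamp within an hour, `filter(Date == min(Date))` keeps both, while the `!duplicated()` approach keeps only one of them.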
