FlyingPickle
FlyingPickle

Reputation: 1133

Calculating hourly min/max with ymd_hms data

I would like to calculate the hourly max of several variables in a data frame, where my data look like this:

Date                   Gas     Diesel
2018-01-01 00:00:00     1.5       2.5
2018-01-01 00:01:00     1.7       1.9
2018-01-01 00:02:00     2.2       1.7
---
---
2018-01-01 01:00:00     2.0       1.0
2018-01-01 01:02:00     2.0       1.0

And my expected output is:

 Date                max.Gas     max.Diesel
2018-01-01 00:00:00   2.2          2.5
2018-01-01 01:00:00   2.0          1.0

Any thoughts on how I can achieve this in R, preferably in a tidyverse style?

Upvotes: 0

Views: 96

Answers (1)

DanY
DanY

Reputation: 6073

A dplyr/tidyverse approach:

# load tidyverse packages
library(dplyr)
library(lubridate)

# create example data
mydf <- tibble(
    date   = ymd_hms(c("2018-01-01 00:00:00", "2018-01-01 00:01:00", "2018-01-01 00:02:00", "2018-01-01 01:00:00", "2018-01-01 01:01:00")),
    gas    = c(1.5, 1.7, 2.2, 2.0, 2.0),
    diesel = c(2.5, 1.9, 1.7, 1.0, 1.0)
)

# create a new variable "datehour" and get max(gas) by datehour
mydf %>% 
    mutate(datehour = floor_date(date, unit="hour")) %>% 
    group_by(datehour) %>% 
    summarize(maxgas = max(gas), maxdiesel = max(diesel))

To do this for many columns without typing, e.g., maxgas = max(gas) for each column, use summarize_all:

mydf %>% 
    mutate(datehour = floor_date(date, unit="hour")) %>% 
    select(-date) %>% 
    group_by(datehour) %>% 
    summarize_all(max)

Upvotes: 2

Related Questions