Reputation: 123
I have ten minute data I am trying to aggregate into hourly data. I am using a POSIXct formatted time field. I found a similar question posed here. I would like to group the data by hour, but retain a POSIXct formatted field for each hour for the year ("%Y-%m-%d %H:%M:%S"). In addition, I am hoping to retain all the fields in my original dataframe except what I am passing through to summarize using dplyr. I am encountering problems using group_by_at, which elicits the following error:
Error in as.POSIXlt.default(x, tz = tz(x)) : do not know how to convert 'x' to class “POSIXlt”
When I try to just use group_by, it works, but converts the ymd_hms format into just %h. Is there a way to keep this format intact when grouping by hour? Is there a more efficient way to group the ten minute data into an hourly format?
Here is an example of what works, but removes the ymd_hms format:
summary_df <- long_df %>%
group_by(hour = lubridate::hour(time), discrete_variable) %>%
summarise(max_continuous_variable = max(continuous_variable))
Upvotes: 2
Views: 1621
Reputation: 66870
I'd suggest lubridate::floor_date
for this. It will round down to the last hour, giving you a datetime for grouping.
summary_df <- long_df %>%
group_by(hour = lubridate::floor_date(time, "1 hour"), discrete_variable) %>%
summarise(max_continuous_variable = max(continuous_variable))
Upvotes: 1
Reputation: 123
I asked a colleague, who was able to provide an answer that worked for me. I wanted to share the successful result in case this is relevant to others. Using summarise and applying min(time) after grouping worked to retain the time field that has "%Y-%m-%d %H:%M:%S".
summary_df <- long_df %>%
group_by(hour = lubridate::hour(time), discrete_variable) %>%
summarise(max_continuous_variable = max(continuous_variable),
time = min(time))
Upvotes: 0