Reputation: 67
I'm working with hourly polutant data measured in various air quality stations. I want to have the daily average for all stations in one DF.
date | site a | site b | site c |
---|---|---|---|
2010-01-01 00:00:00 | 3 | 2 | 1 |
2010-01-01 01:00:00 | 2 | 2 | 2 |
2010-01-01 02:00:00 | 1 | 2 | 3 |
and I would like:
date | site a | site b | site c |
---|---|---|---|
2010-01-01 | 2 | 2 | 2 |
daylyavg <- df %>%
group_by(Month, Day) %>%
summarize(Avg_a = mean(site a))
Month | day | Avg_a |
---|---|---|
1 | 1 | 2 |
Upvotes: 1
Views: 38
Reputation: 66445
Here's an approach using dplyr::across
:
library(tidyverse); library(lubridate)
my_data %>%
mutate(date = ymd_hms(date)) %>% # in case not already POSIXct date-time
mutate(month = month(date),
day = day(date)) %>%
group_by(month, day) %>%
summarize(across(starts_with("site"), mean,
.names = "Avg_{.col}")) %>%
ungroup()
Result:
month day Avg_site.a Avg_site.b Avg_site.c
<dbl> <int> <dbl> <dbl> <dbl>
1 1 1 2 2 2
Data:
my_data <- data.frame(
date = c("2010-01-01 00:00:00",
"2010-01-01 01:00:00","2010-01-01 02:00:00"),
site.a = c(3L, 2L, 1L),
site.b = c(2L, 2L, 2L),
site.c = c(1L, 2L, 3L)
)
Upvotes: 2