Reputation: 3805
My dataframe consists of monthly weather data as follows for a given location
set.seed(123)
dat <-
data.frame(Year = rep(1980:1985, each = 12),
Month = rep(1:12, times = 6),
value = runif(12*6))
I have split the year into seasons as shown below.
s1 <- c(11, 12, 1, 2) # season 1 consists of month 11, 12, 1 and 2 i.e. cuts across years
s2 <- c(3, 4, 5) # season 2 consists of month 3, 4, 5
s3 <- c(6, 7, 8, 9, 10) # season 3 consists of month 6, 7, 8, 9, 10
Taking example for 1980 - season 1 is Nov-Dec from 1979 and Jan-Feb from 1980 season 2 is from March - May of 1980 season 3 is June - Oct of 1980
However, for year 1980, season 1 is incomplete since it only has months 1 and 2 and missing the months 11 and 12 from 1979. In contrast, for year 1985 season 1 to season 3 is complete and hence I do not need months 11 and 12 from 1985 since it contributes to 1986 season1
With this background, I want to sum monthly values of each season by year
so that the dataframe is in year X season format instead of year-month format
In doing so there will be no values for 1980 season1 since it has missing months.
For cases when months cut across years, I don't know how to sum individual months?
library(dplyr)
season_list <- list(s1, s2, s3)
temp_list <- list()
for(s in seq_along(season_list)){
season_ref <- unlist(season_list[s])
if(sum(diff(season_ref) < 0) != 0){ # check if season cuts across years
dat %>%
dplyr::filter(Month %in% season_ref) %>%
# how do I sum across years for this exception
} else {
# if season does not cut across years, simply filter the months in each year and add
temp_list[[s]] <-
dat %>%
dplyr::filter(Month %in% season_ref) %>%
dplyr::group_by(Year) %>%
dplyr::summarise(season_value = sum(value)) %>%
dplyr::mutate(season = s)
}
}
Upvotes: 0
Views: 110
Reputation: 269586
Assuming that you want to sum the values for each season calculate the Season and endYear (the year that the season ends) and then sum by those.
dat %>%
group_by(endYear = Year + (Month %in% 11:12),
Season = 1 * (Month %in% s1) +
2 * (Month %in% s2) +
3 * (Month %in% s3)) %>%
summarize(value = sum(value), .groups = "drop")
giving:
# A tibble: 19 x 3
endYear Season value
<int> <dbl> <dbl>
1 1980 1 1.08
2 1980 2 2.23
3 1980 3 2.47
4 1981 1 2.66
5 1981 2 1.25
6 1981 3 2.91
7 1982 1 3.00
8 1982 2 1.43
9 1982 3 3.50
10 1983 1 1.48
11 1983 2 0.693
12 1983 3 1.49
13 1984 1 1.82
14 1984 2 1.29
15 1984 3 1.77
16 1985 1 2.03
17 1985 2 1.47
18 1985 3 3.31
19 1986 1 1.38
Upvotes: 0