Reputation: 345
My question has two parts. Part A: I would like to construct a data frame containing the maximum value of total_precip (a variable in my data frame) in each month, together with the year, month, and day on which it occurred. Part B: I would like another data frame containing the maximum cumulative total_precip over two consecutive days in each month (i.e., the two consecutive days whose summed total_precip is higher than that of any other two consecutive days in the month), with the dates and the corresponding value. For example, if the sum of total_precip on Jan 10 and 11 is higher than for any other two consecutive days of that month, the dates (year, month, and days) and the sum should be stored in the data frame, and likewise for every month of every year.
Here is the code I started with for part A, but it gives me only the maximum value in each month, without indicating on which day that maximum occurred.
library(weathercan)
library(tidyverse)
DF = weather_dl(station_ids = 2925, start = "1990-01-01", end = "1995-12-31", interval = "day")[,c(11,12,13,14,32)]
DF$month = as.numeric(DF$month)
DF$day = as.numeric(DF$day)
MaxValWithDate = DF %>% group_by(year, month) %>% summarise(MaxVal = max(total_precip))
Upvotes: 1
Views: 6227
Reputation: 10761
We can use slice for part A:
DF %>%
  group_by(year, month) %>%
  slice(which.max(total_precip))
# date year month day total_precip
# <date> <chr> <dbl> <dbl> <dbl>
# 1 1990-01-28 1990 1 28 7.8
# 2 1990-02-21 1990 2 21 4.8
# 3 1990-03-12 1990 3 12 49.2
# ....
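One detail worth knowing about the approach above: which.max() returns only the first position of the maximum and skips NA values, so a month with two equally wet days yields a single row. Below is a minimal sketch on a small made-up data frame (toy and its values are hypothetical, not the weathercan download) contrasting this with slice_max(), which keeps ties by default.

```r
library(dplyr)

# Hypothetical toy data standing in for DF; the values are invented
toy <- tibble(
  date = as.Date("1990-01-01") + 0:5,
  total_precip = c(1.0, 7.8, 0.5, 0.2, 7.8, 3.1)
) %>%
  mutate(year  = format(date, "%Y"),
         month = as.numeric(format(date, "%m")),
         day   = as.numeric(format(date, "%d")))

# which.max() picks the first of the tied maxima: one row per month
first_max <- toy %>%
  group_by(year, month) %>%
  slice(which.max(total_precip)) %>%
  ungroup()

# slice_max() keeps ties by default (with_ties = TRUE): both 7.8 days are returned
all_max <- toy %>%
  group_by(year, month) %>%
  slice_max(total_precip, n = 1) %>%
  ungroup()
```

Which behaviour is preferable depends on whether you want exactly one row per month or every day that reached the monthly maximum.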
And then we can use the lead function along with slice again for part B:
DF %>%
  group_by(year, month) %>%
  mutate(lead_total_precip = lead(total_precip),
         lead_day = lead(date)) %>%
  mutate(cumu_precip = total_precip + lead_total_precip) %>%
  slice(which.max(cumu_precip))
# date year month day total_precip lead_total_precip lead_day cumu_precip
# <date> <chr> <dbl> <dbl> <dbl> <dbl> <date> <dbl>
# 1 1990-01-28 1990 1 28 7.8 5.2 1990-01-29 13
# 2 1990-02-21 1990 2 21 4.8 1.8 1990-02-22 6.6
# 3 1990-03-11 1990 3 11 0 49.2 1990-03-12 49.2
# ....
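Note that because lead() is called after group_by(year, month), the last day of each month gets NA as its lead, so a pair such as Jan 31 + Feb 1 is never considered. That matches a strict "within each month" reading of the question. If pairs that straddle a month boundary should count (attributed to the month of the first day), one possible variant is to compute the lead before grouping. The sketch below uses invented data; toy, within_month, and across_months are hypothetical names, and the date-gap check is an extra safeguard that is not part of the answer above.

```r
library(dplyr)

# Hypothetical toy data spanning a month boundary; the values are invented
toy <- tibble(
  date = seq(as.Date("1990-01-29"), as.Date("1990-02-03"), by = "day"),
  total_precip = c(1, 2, 10, 12, 0, 3)
) %>%
  mutate(year  = format(date, "%Y"),
         month = as.numeric(format(date, "%m")))

# Grouped lead (as in the answer): Jan 31 has no partner inside January
within_month <- toy %>%
  group_by(year, month) %>%
  mutate(cumu_precip = total_precip + lead(total_precip)) %>%
  slice(which.max(cumu_precip)) %>%
  ungroup()

# Ungrouped lead: Jan 31 is paired with Feb 1; rows that are not exactly
# one day apart (gaps in the record) get NA instead of a misleading sum
across_months <- toy %>%
  arrange(date) %>%
  mutate(next_date = lead(date),
         cumu_precip = if_else(as.numeric(next_date - date) == 1,
                               total_precip + lead(total_precip),
                               NA_real_)) %>%
  group_by(year, month) %>%
  slice(which.max(cumu_precip)) %>%
  ungroup()
```

With this toy data the January result changes from the Jan 30 + Jan 31 pair to the Jan 31 + Feb 1 pair, while the February result is the same in both versions.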
The resulting data frames should contain all the information you need; you can then use the select function to keep only the columns you want.
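As a rough illustration of that last step, the sketch below runs the part B pipeline on a small invented data frame and then uses select() to keep and rename the relevant columns. The names toy, tidy, start_date, and end_date are hypothetical choices for this example only.

```r
library(dplyr)

# Hypothetical toy data; the values are invented
toy <- tibble(
  date = as.Date("1990-01-01") + 0:4,
  total_precip = c(0, 4, 6, 1, 2)
) %>%
  mutate(year  = format(date, "%Y"),
         month = as.numeric(format(date, "%m")))

tidy <- toy %>%
  group_by(year, month) %>%
  mutate(lead_total_precip = lead(total_precip),
         lead_day = lead(date),
         cumu_precip = total_precip + lead_total_precip) %>%
  slice(which.max(cumu_precip)) %>%
  ungroup() %>%
  # select() can rename while selecting: new_name = old_name
  select(year, month, start_date = date, end_date = lead_day, cumu_precip)
```

Calling ungroup() before select() avoids the grouping columns being carried along implicitly, which makes the final column set explicit.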
Upvotes: 4