CForClimate

Reputation: 345

How to find maximum value with date in each month in R?

I have a two-part question. Part A: I would like to build a data frame holding the maximum value of total_precip (a variable in my data frame) in each month, together with the year, month, and day on which it occurred. Part B: I would like another data frame holding, for each month, the maximum two-day cumulative total_precip (i.e., the sum of total_precip over two consecutive days that is higher than for any other pair of consecutive days in that month), along with the dates and the value. For example, if the sum of total_precip on Jan 10 and 11 is higher than for any other two consecutive days of that month, the data frame should store those dates (year, month, and the two days) together with the sum, and likewise for each month of each year.

Here is the code I started with for part A, but it gives only the maximum value in each month, without the day on which that maximum occurred.

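library(weathercan)
library(tidyverse)

# Download daily data and keep date, year, month, day, total_precip (selected by column position)
DF = weather_dl(station_ids = 2925, start = "1990-01-01", end = "1995-12-31", interval = "day")[, c(11, 12, 13, 14, 32)]
DF$month = as.numeric(DF$month)
DF$day = as.numeric(DF$day)

# Monthly maximum only; the day on which it occurred is lost here
MaxValWithDate = DF %>% group_by(year, month) %>% summarise(MaxVal = max(total_precip))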

Upvotes: 1

Views: 6227

Answers (1)

bouncyball

Reputation: 10761

We can use slice for part A:

DF %>%
    group_by(year, month) %>%
    slice(which.max(total_precip))

 #   date       year  month   day total_precip
 #   <date>     <chr> <dbl> <dbl>        <dbl>
 # 1 1990-01-28 1990      1    28          7.8
 # 2 1990-02-21 1990      2    21          4.8
 # 3 1990-03-12 1990      3    12         49.2
 # ....
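To make the behaviour of the part A approach easier to check, here is a minimal sketch on a small hand-made data frame. The `toy` data and its values are hypothetical stand-ins for the weathercan download, not real observations. It also illustrates one point worth knowing: `which.max()` skips `NA` and returns only the first maximum, so tied days are dropped, whereas a `filter()` against the group maximum keeps every tied day.

```r
library(dplyr)

# Hypothetical toy data (not real station data): one NA and one tie
toy <- tibble(
  date = as.Date(c("1990-01-10", "1990-01-11", "1990-01-12",
                   "1990-02-01", "1990-02-02", "1990-02-03")),
  total_precip = c(2.0, 7.5, NA, 4.0, 1.0, 4.0)
) %>%
  mutate(year  = format(date, "%Y"),
         month = as.numeric(format(date, "%m")),
         day   = as.numeric(format(date, "%d")))

# Same technique as above: one row per month, first maximum only
first_max <- toy %>%
  group_by(year, month) %>%
  slice(which.max(total_precip)) %>%
  ungroup()

# Variant that keeps every day tied for the monthly maximum
all_ties <- toy %>%
  group_by(year, month) %>%
  filter(total_precip == max(total_precip, na.rm = TRUE)) %>%
  ungroup()

print(first_max)
print(all_ties)
```

In this toy data, `first_max` has one row per month, while `all_ties` has an extra row for February because two days share the maximum.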

And then we can use the lead function along with slice again for part B:

DF %>%
    group_by(year, month) %>%
    mutate(lead_total_precip = lead(total_precip),
           lead_day = lead(date)) %>%
    mutate(cumu_precip = total_precip + lead_total_precip)  %>%
    slice(which.max(cumu_precip))

 #   date       year  month   day total_precip lead_total_precip lead_day   cumu_precip
 #   <date>     <chr> <dbl> <dbl>        <dbl>             <dbl> <date>           <dbl>
 # 1 1990-01-28 1990      1    28          7.8               5.2 1990-01-29        13
 # 2 1990-02-21 1990      2    21          4.8               1.8 1990-02-22         6.6
 # 3 1990-03-11 1990      3    11          0                49.2 1990-03-12        49.2
 # ....
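Two details of the part B approach may be worth checking on your own data, so here is a hedged sketch on hypothetical toy values (the `toy` data frame and the `next_date` / `next_precip` names are made up for illustration). First, because `lead()` runs inside each year-month group, the last day of a month has no partner, so a pair that straddles two months is never considered. Second, `lead()` pairs each row with the next row, not necessarily the next calendar day, so the sketch adds a guard that counts a pair only when the dates are exactly one day apart.

```r
library(dplyr)

# Hypothetical toy data spanning a month boundary (29 Jan to 3 Feb 1990)
toy <- tibble(
  date = as.Date("1990-01-29") + 0:5,
  total_precip = c(1.0, 3.0, 9.0, 8.0, 0.5, 2.0)
) %>%
  mutate(year  = format(date, "%Y"),
         month = as.numeric(format(date, "%m")))

two_day_max <- toy %>%
  arrange(date) %>%
  group_by(year, month) %>%
  mutate(next_date   = lead(date),
         next_precip = lead(total_precip),
         # Count the pair only if the next row is the next calendar day;
         # the last row of each month gets NA and is skipped by which.max()
         cumu_precip = if_else(as.numeric(next_date - date) == 1,
                               total_precip + next_precip,
                               NA_real_)) %>%
  slice(which.max(cumu_precip)) %>%
  ungroup()

print(two_day_max)
```

Here the largest two-day total overall falls on 31 Jan and 1 Feb, but it is excluded because the two days belong to different months; the result is the best pair that lies entirely within each month. If cross-month pairs should count, one option is to compute the lead columns before `group_by()`, which is a different definition from the one used in this answer.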

The resulting data frames contain all the information you need; you can then use the select function to keep only the columns you want.
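As a small illustration of that last step, the sketch below trims a part B style result down to the dates and the two-day total. The one-row `part_b` data frame simply re-types the first output row shown above, and the new column names `first_day` and `second_day` are arbitrary choices for this example.

```r
library(dplyr)

# One row shaped like the part B output (values copied from the first row above)
part_b <- tibble(
  date = as.Date("1990-01-28"), year = "1990", month = 1, day = 28,
  total_precip = 7.8, lead_total_precip = 5.2,
  lead_day = as.Date("1990-01-29"), cumu_precip = 13
)

# select() can keep and rename columns in one call (new_name = old_name)
part_b_tidy <- part_b %>%
  select(year, month, first_day = date, second_day = lead_day, cumu_precip)

print(part_b_tidy)
```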

Upvotes: 4
