user10090587

Reputation: 11

R, select rainfall events and calculate rainfall event total from time-series data

Here is what I am trying to get the code to do:

- Identify unique rainfall "events" in the dataset. I want to start with an inter-event period of 6 dry hours between events.

- My plan of attack was to create a column containing a unique flag for each event. The event flag or ID could be the start date-time stamp of the event, or simply the previous identifier plus one (1,1,1,1,2,2,2,2, etc.). I'm having trouble creating this unique flag, because I need R to "look ahead" in the precip column to see whether it rains within the next 6 hours. If it does, R should create a flag.

- Finally, I'd like an output (similar to a pivot table) that sums the total precip in inches for each unique event and also gives the start time, stop time, and total duration of the event.

EXAMPLE OUTPUT

Event ID   Precip (in)   Event Start        Event Stop         Time (hours)
1          0.07          10/6/2017 17:00    10/6/2017 22:00    6:00
2          0.01          10/7/2017 15:00    10/7/2017 15:00    1:00
3          0.15          10/10/2017 11:00   10/10/2017 13:00   3:00

CODE
library(zoo) # to get rollsum fxn

DF1 <- read.csv("U:/R_files/EOF_Rainfall_Stats_2017-18/Precip_DF1_Oct17toMay18.csv")

# Flag wet hours: 1 if any precipitation was recorded, 0 otherwise
DF1$event <- as.numeric(DF1$Precip_in > 0)
str(DF1)

# Number of wet hours in the trailing 6-hour window (NA for the first 5 rows)
DF1$rollsum6 <- rollsum(DF1$event, k = 6, fill = NA, align = "right")

# Attempted event flag: a wet hour with at least one other wet hour in the same window
DF1$eventID <- ifelse(DF1$rollsum6 >= 2 & DF1$event == 1, "flag", NA)

RAW DATA

DateTime          Precip_in
10/6/2017 13:00   0
10/6/2017 14:00   0
10/6/2017 15:00   0
10/6/2017 16:00   0
10/6/2017 17:00   0.04
10/6/2017 18:00   0
10/6/2017 19:00   0
10/6/2017 20:00   0
10/6/2017 21:00   0.01
10/6/2017 22:00   0.02
10/6/2017 23:00   0
10/7/2017 0:00    0
10/7/2017 1:00    0
10/7/2017 2:00    0
10/7/2017 3:00    0
10/7/2017 4:00    0
10/7/2017 5:00    0
10/7/2017 6:00    0
10/7/2017 7:00    0
10/7/2017 8:00    0
10/7/2017 9:00    0
10/7/2017 10:00   0
10/7/2017 11:00   0
10/7/2017 12:00   0
10/7/2017 13:00   0
10/7/2017 14:00   0
10/7/2017 15:00   0.01

Upvotes: 1

Views: 1603

Answers (1)

loreabad6

Reputation: 315

If anyone is still looking for a way to solve this, here is my 'tidy' approach to it. I saved the data in a variable called data.
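To follow along without the original CSV, the sample rows from the question can be rebuilt by hand. This is only a minimal sketch: it assumes the 27 hourly records shown under RAW DATA, and `stamps` is a helper name introduced here for illustration. DateTime is kept as text so that the conversion step below still applies.

```r
# Hourly timestamps covering the sample in the question
# (13:00 on 6 Oct 2017 through 15:00 on 7 Oct 2017)
stamps <- seq(
  from = as.POSIXct("2017-10-06 13:00", tz = "UTC"),
  to   = as.POSIXct("2017-10-07 15:00", tz = "UTC"),
  by   = "hour"
)

data <- data.frame(
  # Stored as text in month/day/year form, like the CSV in the question
  DateTime  = format(stamps, "%m/%d/%Y %H:%M", tz = "UTC"),
  Precip_in = 0,
  stringsAsFactors = FALSE
)

# The four wet hours in the sample: 17:00, 21:00 and 22:00 on 6 Oct, 15:00 on 7 Oct
data$Precip_in[c(5, 9, 10, 27)] <- c(0.04, 0.01, 0.02, 0.01)
```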

library(dplyr)

# Set the DateTime column as POSIXct, which is needed for calculating the duration afterwards
data <- data %>% mutate(DateTime = as.POSIXct(DateTime, format = '%m/%d/%Y %H:%M'))

flags <- data %>% 
  # Set a rain flag if there is rain registered on the gauge
  mutate(rainflag = ifelse(Precip_in > 0, 1, 0)) %>% 
  # Create a column holding the length of the run of consecutive rain / no-rain rows that each row belongs to.
  # `rle` gives the length of each run of identical values, and `rep` repeats that length for every row in the run.
  mutate(rainlength = rep(rle(rainflag)$lengths, rle(rainflag)$lengths)) %>% 
  # Set a flag for an event happening, when there is rain there is a rain event, 
  # when it is 0 but not for six consecutive times, it is still a rain event
  mutate(
    eventflag = ifelse(
      rainflag == 1, 
      1, 
      ifelse(
        rainflag == 0 & rainlength < 6, 
        1, 
        0
      )
    )
  ) %>% 
  # Correct for the case where the dataset starts with a dry run shorter than six rows:
  # any dry row among the first five rows gets its event flag reset to 0
  mutate(eventflag = ifelse(row_number() < 6 & rainflag == 0, 0, eventflag)) %>% 
  # Add an id to each event (rain or not), to group by on the pivot table
  mutate(eventid = rep(seq(1,length(rle(eventflag)$lengths)), rle(eventflag)$lengths))

rain_pivot <- flags %>% 
  # Select only the rain events
  filter(eventflag == 1) %>% 
  # Group by id
  group_by(eventid) %>% 
  summarize(
    precipitation = sum(Precip_in),
    eventStart = first(DateTime),
    eventEnd = last(DateTime)
  ) %>% 
  # Compute time difference as duration of event, add 1 hour, knowing that the timestamp is the time when the rain record ends
  mutate(time = as.numeric(difftime(eventEnd,eventStart, units = 'h')) + 1)

rain_pivot
#> # A tibble: 2 x 5
#>   eventid precipitation eventStart          eventEnd             time
#>     <int>         <dbl> <dttm>              <dttm>              <dbl>
#> 1       2          0.07 2017-10-06 17:00:00 2017-10-06 22:00:00     6
#> 2       4          0.01 2017-10-07 15:00:00 2017-10-07 15:00:00     1
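
Note that the run-length approach also treats a dry run shorter than six rows at the very end of the series as part of an event. An alternative that sidesteps both edge cases is to look only at the wet records and start a new event whenever the gap to the previous wet record is long enough. The following is a sketch in base R, not a drop-in replacement: it assumes hourly records with DateTime already converted to POSIXct, and `summarise_events` and `min_dry_hours` are names made up for this example.

```r
# Hypothetical helper: label events by the gap between consecutive wet records
summarise_events <- function(df, min_dry_hours = 6) {
  wet <- df[df$Precip_in > 0, ]
  if (nrow(wet) == 0) return(NULL)
  wet <- wet[order(wet$DateTime), ]

  # Hours between each wet record and the previous one (Inf for the first)
  gap <- c(Inf, as.numeric(diff(wet$DateTime), units = "hours"))

  # With hourly data, a gap of more than `min_dry_hours` hours means at least
  # `min_dry_hours` dry records in between, so a new event starts there
  wet$eventid <- cumsum(gap > min_dry_hours)

  # One summary row per event
  pieces <- lapply(split(wet, wet$eventid), function(ev) {
    data.frame(
      eventid       = ev$eventid[1],
      precipitation = sum(ev$Precip_in),
      eventStart    = min(ev$DateTime),
      eventEnd      = max(ev$DateTime)
    )
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL

  # Add 1 hour because each timestamp marks the end of an hourly record
  out$time <- as.numeric(difftime(out$eventEnd, out$eventStart, units = "hours")) + 1
  out
}

# Sample rows from the question, rebuilt by hand
sample_data <- data.frame(
  DateTime = seq(
    from = as.POSIXct("2017-10-06 13:00", tz = "UTC"),
    to   = as.POSIXct("2017-10-07 15:00", tz = "UTC"),
    by   = "hour"
  ),
  Precip_in = 0
)
sample_data$Precip_in[c(5, 9, 10, 27)] <- c(0.04, 0.01, 0.02, 0.01)

events <- summarise_events(sample_data)
events
```

On the sample data this gives the same two events as the pivot above, with the difference that the event ids are consecutive (1 and 2) because dry periods are never numbered.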

Upvotes: 3
