Reputation: 13
I am trying to assess the number of dry spells in a given data set.
Here is a sample of the data (precip is the total daily precipitation in mm):
date precip index
1976-01-15 11.4 0
1976-01-16 10.3 0
1976-01-17 3.2 0
1976-01-18 0.0 1
1976-01-19 1.2 0
1976-01-20 1.7 0
1976-01-21 3.1 0
1976-01-22 9.2 0
1976-01-23 4.6 0
1976-01-24 1.9 0
1976-01-25 0.0 1
1976-01-26 0.1 1
1976-01-27 0.2 0
1976-01-28 0.0 1
1976-01-29 0.0 1
1976-01-30 0.0 1
The index column separates dry days from wet days. Dry days, indexed as 1, are days when total precipitation is below 0.2 mm. Days with 0.2 mm or more are considered wet and indexed as 0.
However, I would like to add another condition: if a day is relatively dry (precipitation does not exceed 1 mm) and falls between two dry days (< 0.2 mm), the dry spell continues. For example, 1976-01-27 would be considered a dry day, and there would be one dry spell of 6 days instead of two shorter ones. The desired result is:
date precip index
1976-01-15 11.4 0
1976-01-16 10.3 0
1976-01-17 3.2 0
1976-01-18 0.0 1
1976-01-19 1.2 0
1976-01-20 1.7 0
1976-01-21 3.1 0
1976-01-22 9.2 0
1976-01-23 4.6 0
1976-01-24 1.9 0
1976-01-25 0.0 1
1976-01-26 0.1 1
1976-01-27 0.2 1
1976-01-28 0.0 1
1976-01-29 0.0 1
1976-01-30 0.0 1
I would really appreciate any help or suggestions. Thank you very much! :)
Upvotes: 1
Views: 76
Reputation: 70266
Using dplyr's lag and lead functions you could do:
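library(dplyr)
# A day counts as dry if it is already indexed 1, or if it has at most 1 mm
# of precipitation and both the previous and the next day are indexed 1.
mutate(df, index2 = (index | (precip <= 1 & lag(index) & lead(index))) + 0L)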
# date precip index index2
#1 1976-01-15 11.4 0 0
#2 1976-01-16 10.3 0 0
#3 1976-01-17 3.2 0 0
#4 1976-01-18 0.0 1 1
#5 1976-01-19 1.2 0 0
#6 1976-01-20 1.7 0 0
#7 1976-01-21 3.1 0 0
#8 1976-01-22 9.2 0 0
#9 1976-01-23 4.6 0 0
#10 1976-01-24 1.9 0 0
#11 1976-01-25 0.0 1 1
#12 1976-01-26 0.1 1 1
#13 1976-01-27 0.2 0 1
#14 1976-01-28 0.0 1 1
#15 1976-01-29 0.0 1 1
#16 1976-01-30 0.0 1 1
The +0L turns the logical values (TRUE/FALSE) into integer values (1/0).
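Since the stated goal is to count dry spells, here is a minimal base-R sketch of one way to get from the adjusted index to the spell counts. It rebuilds the sample data from the question, applies the same rule as index2 without dplyr, and then uses rle() to find runs of dry days. The names prev_dry, next_dry, runs and spell_lengths are made up for illustration, and padding the edges with 0 (treating days outside the record as not dry) is an assumption, not something the question specifies.

```r
# Sample data from the question (1976-01-15 to 1976-01-30).
df <- data.frame(
  date = seq(as.Date("1976-01-15"), by = "day", length.out = 16),
  precip = c(11.4, 10.3, 3.2, 0.0, 1.2, 1.7, 3.1, 9.2,
             4.6, 1.9, 0.0, 0.1, 0.2, 0.0, 0.0, 0.0),
  index = c(0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 1L, 1L, 1L)
)

# Base-R stand-ins for lag()/lead(). Assumption: the edges are padded with 0,
# i.e. days outside the record are treated as not dry.
n <- nrow(df)
prev_dry <- c(0L, df$index[-n])
next_dry <- c(df$index[-1], 0L)

# Same rule as index2: already dry, or at most 1 mm and flanked by dry days.
df$index2 <- as.integer(
  df$index == 1L | (df$precip <= 1 & prev_dry == 1L & next_dry == 1L)
)

# Run-length encode index2; runs whose value is 1 are dry spells.
runs <- rle(df$index2)
spell_lengths <- runs$lengths[runs$values == 1L]

length(spell_lengths)  # number of dry spells: 2
spell_lengths          # days per spell: 1 6
```

On the sample data this gives two dry spells, a single dry day on 1976-01-18 and the 6-day spell from 1976-01-25 to 1976-01-30. To count only spells of some minimum length, you could filter spell_lengths before counting.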
Upvotes: 2