Reputation: 3555
I'm trying to count the number of precipitation events at or below a certain threshold (say, less than or equal to 50 mm) between two dates.
Basically, I have a vector cuts that contains the dates I want to count between, inclusively. I want to use the cuts vector to "subset" the dataset into different bins and then count the number of events where it rained 50 mm or less.
I'm using dplyr and a for loop at the moment, but nothing is working.
library(dplyr)
set.seed(12345)
df = data.frame(date = seq(as.Date("2000/03/01"), as.Date("2002/03/01"), "days"),
                precipitation = rnorm(length(seq(as.Date("2000/03/01"), as.Date("2002/03/01"), "days")), 80, 20))
cuts = c("2001-11-25","2002-01-01","2002-02-18","2002-03-01")
for (i in 1:length(cuts)) {
df %>% summarise(count.prec = if (date > cuts[i] | date < cuts[i+1]) {count(precipitation <= 50)})
}
But I get this error message:
Error: no applicable method for 'group_by_' applied to an object of class "logical"
In addition: Warning message:
In if (c(11017, 11018, 11019, 11020, 11021, 11022, 11023, 11024, :
the condition has length > 1 and only the first element will be used
This is not working either:
for (i in 1:length(cuts)) {
df %>% if (date > cuts[i] | date < cuts[i+1])%>% summarise(count.prec = count(precipitation <= 50))
}
Upvotes: 3
Views: 1442
Reputation: 21621
You could try:
library(dplyr)
df %>%
  group_by(gr = cut(date, breaks = as.Date(cuts))) %>%
  summarise(res = sum(precipitation <= 50))
Which gives:
# A tibble: 4 × 2
          gr   res
      <fctr> <int>
1 2001-11-25     1
2 2002-01-01     4
3 2002-02-18     2
4         NA    40
Or, as mentioned by @Frank, you could replace summarise() with tally(precipitation <= 50).
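The NA row in the output collects every date that falls outside the range spanned by cuts. If that group is not wanted, one option is to filter first. Here is a minimal sketch on a small hand-made data set (the toy values are my own assumption, not the question's data). Note that cut() on dates builds left-closed bins, so a date equal to a break goes into the bin that starts there, and the final break itself falls outside the last bin.

```r
library(dplyr)

# Hypothetical toy data, chosen so the counts can be checked by hand
toy <- data.frame(
  date = as.Date(c("2001-11-24", "2001-11-25", "2001-12-31",
                   "2002-01-01", "2002-02-17", "2002-02-18", "2002-03-01")),
  precipitation = c(10, 40, 60, 50, 45, 30, 20)
)
cuts <- as.Date(c("2001-11-25", "2002-01-01", "2002-02-18", "2002-03-01"))

res <- toy %>%
  # keep only dates that cut() can assign to a bin, so no NA group appears
  filter(date >= min(cuts), date < max(cuts)) %>%
  # bins are [break, next break), labelled by their starting date
  group_by(gr = cut(date, breaks = cuts)) %>%
  summarise(res = sum(precipitation <= 50))

print(res)
```

With this toy data the three bins should contain 1, 2 and 1 qualifying days respectively.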
Upvotes: 5
Reputation: 886938
We can try a non-equi join using data.table:
library(data.table) # v1.9.7+
df2 <- data.table(cuts1 = as.Date(cuts[-length(cuts)]), cuts2 = as.Date(cuts[-1]))
setDT(df)[df2, .(Count = sum(precipitation <= 50)),
          on = .(date > cuts1, date < cuts2), by = .EACHI]
#         date       date Count
#1: 2001-11-25 2002-01-01     1
#2: 2002-01-01 2002-02-18     4
#3: 2002-02-18 2002-03-01     2
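The join above uses strict inequalities, so days that fall exactly on a cut date are excluded. Since the question asks for inclusive bounds, a variant with >= and <= might be closer to what is wanted. This is only a sketch on a small hand-made data set (the toy values and the names toy, bins, start and end are my own assumptions). With inclusive bounds on both sides, a day that sits exactly on an interior cut date is counted in two bins.

```r
library(data.table)

# Hypothetical toy data, chosen so the counts can be checked by hand
toy <- data.table(
  date = as.Date(c("2001-11-24", "2001-11-25", "2001-12-31",
                   "2002-01-01", "2002-02-17", "2002-02-18", "2002-03-01")),
  precipitation = c(10, 40, 60, 50, 45, 30, 20)
)
cuts <- as.Date(c("2001-11-25", "2002-01-01", "2002-02-18", "2002-03-01"))

# One row per bin: consecutive pairs of cut dates
bins <- data.table(start = cuts[-length(cuts)], end = cuts[-1])

# Non-equi join with inclusive bounds; by = .EACHI evaluates j once per bin
res <- toy[bins, .(Count = sum(precipitation <= 50)),
           on = .(date >= start, date <= end), by = .EACHI]

print(res)
```

For the toy data this should give counts of 2, 3 and 2, with 2002-01-01 and 2002-02-18 each contributing to two neighbouring bins.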
Upvotes: 1