Reputation: 25
I would like to know whether it is possible to count the number of "visits" using dplyr. A new visit starts every time the elapsed time is 30 or more, or when the species changes.
I tried grouping by species:
visit <- rawdata %>%
  group_by(Species) %>%
  mutate(VisitNo = cumsum(Elapsed >= 30))
But this restarts the count at 0 for every species.
Species Elapsed VisitNo
aardvark 5
aardvark 10
aardvark 2
aardvark 30
aardvark 4
aardvark 30
aardvark 10
Jackal 5
Jackal 30
Impala 5
Impala 30
Expected output:
Species Elapsed VisitNo
aardvark 5 1
aardvark 10 1
aardvark 2 1
aardvark 30 2
aardvark 4 2
aardvark 30 3
aardvark 10 3
Jackal 5 4
Jackal 30 5
Impala 5 6
Impala 30 7
Thanks for the help.
Upvotes: 1
Views: 29
Reputation: 388797
Another option:
library(dplyr)
df %>%
  mutate(VisitNo = cumsum(Species != lag(Species, default = last(Species)) |
                          Elapsed >= 30))
# Species Elapsed VisitNo
#1 aardvark 5 1
#2 aardvark 10 1
#3 aardvark 2 1
#4 aardvark 30 2
#5 aardvark 4 2
#6 aardvark 30 3
#7 aardvark 10 3
#8 Jackal 5 4
#9 Jackal 30 5
#10 Impala 5 6
#11 Impala 30 7
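To make the logic easier to follow, here is a minimal reproducible sketch that rebuilds the question's data and computes the two conditions as separate columns. The column names `species_changed` and `long_gap` are made up for illustration. It also swaps `default = last(Species)` for an explicit first-row check, on the assumption that the first row should always open a visit even when the first and last species happen to be the same.

```r
library(dplyr)

# Sample data copied from the question
df <- data.frame(
  Species = c(rep("aardvark", 7), rep("Jackal", 2), rep("Impala", 2)),
  Elapsed = c(5, 10, 2, 30, 4, 30, 10, 5, 30, 5, 30),
  stringsAsFactors = FALSE
)

res <- df %>%
  mutate(
    # TRUE on the first row and wherever the species differs from the previous row
    # (illustrative column name, not part of the original answer)
    species_changed = row_number() == 1 | Species != lag(Species),
    # TRUE wherever the elapsed time reaches the threshold of 30
    long_gap = Elapsed >= 30,
    # Every TRUE in either condition opens a new visit
    VisitNo = cumsum(species_changed | long_gap)
  )

print(res)
```

Because `cumsum()` runs over the whole ungrouped data frame, the counter keeps increasing across species instead of restarting for each group.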
Similarly with data.table:
library(data.table)
setDT(df)[, VisitNo := cumsum(Species != shift(Species, fill = last(Species)) |
                              Elapsed >= 30)]
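If neither package is available, the same idea can probably be written in base R. This is only a sketch, not part of the approaches above: it assumes `df` has at least one row and treats the first row as the start of a visit.

```r
# Sample data copied from the question
df <- data.frame(
  Species = c(rep("aardvark", 7), rep("Jackal", 2), rep("Impala", 2)),
  Elapsed = c(5, 10, 2, 30, 4, 30, 10, 5, 30, 5, 30),
  stringsAsFactors = FALSE
)

n <- nrow(df)

# Compare each species with the one on the previous row;
# the first row has no predecessor, so it always starts a visit
species_changed <- c(TRUE, df$Species[-1] != df$Species[-n])

# A new visit starts on a species change or when Elapsed is 30 or more
df$VisitNo <- cumsum(species_changed | df$Elapsed >= 30)

print(df)
```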
Upvotes: 1
Reputation: 39858
You can do:
df %>%
  mutate(VisitNo = cumsum(!duplicated(Species) | Elapsed >= 30))
Species Elapsed VisitNo
1 aardvark 5 1
2 aardvark 10 1
3 aardvark 2 1
4 aardvark 30 2
5 aardvark 4 2
6 aardvark 30 3
7 aardvark 10 3
8 Jackal 5 4
9 Jackal 30 5
10 Impala 5 6
11 Impala 30 7
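One thing to keep in mind: `!duplicated(Species)` flags only the first appearance of each species, so it matches the "species changed" rule only while no species comes back later in the data. The sketch below uses a small made-up data frame (`df2`, not from the question) in which aardvark reappears after Jackal, and compares it with a previous-row comparison written in base R.

```r
# Hypothetical data: aardvark shows up again after Jackal
df2 <- data.frame(
  Species = c("aardvark", "aardvark", "Jackal", "aardvark", "aardvark"),
  Elapsed = c(5, 10, 5, 5, 30),
  stringsAsFactors = FALSE
)

# TRUE only the first time each species is seen
first_seen <- !duplicated(df2$Species)

# TRUE whenever the species differs from the previous row (first row included)
changed <- c(TRUE, df2$Species[-1] != df2$Species[-nrow(df2)])

by_first_seen <- cumsum(first_seen | df2$Elapsed >= 30)
by_change     <- cumsum(changed | df2$Elapsed >= 30)

print(data.frame(df2, by_first_seen, by_change))
```

On this input the two columns diverge at the fourth row, where aardvark returns: the `duplicated()` version keeps it in the Jackal visit, while the previous-row comparison opens a new one. If species never reappear, as in the question's data, both give the same result.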
Upvotes: 2