Reputation: 1350
I'm working with a somewhat bizarre data format from which I'd like to extract some information. I've done some of this work in Python, but I was wondering whether it's crazy to attempt some of it in R.
The format is like this: Three columns:
The oddness of this format comes from logical constraints implied by the row ordering.
The timestamps are generally monotonically increasing, except that a fourth group indicator is implicitly provided by the ordering of the monotonically increasing chunks: each time the timestamp drops, a new chunk begins.
In case that's not clear, consider the following sequence of timestamps:
0 1 2 3 4 5 | 1 2 3 4 | 1 2 3 4 5
     0      |    1    |     2
This is three chunks: every row in the first chunk has an implicit group indicator of 0, every row in the second chunk has indicator 1, and so on.
Everything becomes a bit easier if I can make that fourth indicator explicit. This is easy enough in Python, but I was hoping to keep this all in R.
A straight loop over the data would be easy enough, but I figure there may be a faster, more R-ish (vectorized) way to do this.
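For reference, the straight loop I have in mind might look like the sketch below. The names `stamps` and `chunk_ids_loop` are made up for illustration, and it assumes a new chunk starts whenever a timestamp is smaller than the one before it.

```r
# Hypothetical helper: label each timestamp with the index of its
# monotonically increasing chunk, counting from 0.
chunk_ids_loop <- function(stamps) {
  ids <- integer(length(stamps))
  current <- 0L
  for (i in seq_along(stamps)) {
    # A drop relative to the previous timestamp starts a new chunk.
    if (i > 1 && stamps[i] < stamps[i - 1]) {
      current <- current + 1L
    }
    ids[i] <- current
  }
  ids
}

stamps <- c(0, 1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4, 5)
chunk_ids_loop(stamps)
```

This works, but it visits every element one at a time, which is what I'd like to avoid.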
Upvotes: 0
Views: 132
Reputation: 270010
Try this:
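tt <- c(0, 1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4, 5)
# diff(tt) < 0 is TRUE wherever the timestamp drops, i.e. at the start of a new chunk;
# prepending FALSE for the first element and taking cumsum() numbers the chunks from 0.
grp <- cumsum(c(FALSE, diff(tt) < 0))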
which gives:
> grp
[1] 0 0 0 0 0 0 1 1 1 1 2 2 2 2 2
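To make the indicator explicit in the data itself, the same expression can be stored as a new column. This is only a minimal sketch, assuming the data sits in a data frame with at least one row; the names `dat`, `time`, `value`, and `chunk` are placeholders, since the real column names aren't given in the question.

```r
# Placeholder data: only the ordering of the timestamps matters here.
dat <- data.frame(
  time  = c(0, 1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4, 5),
  value = seq_len(15)
)

# A new chunk starts whenever the timestamp decreases relative to the previous row.
dat$chunk <- cumsum(c(FALSE, diff(dat$time) < 0))

# Rows per chunk, as a quick sanity check.
table(dat$chunk)

# One data frame per chunk, for further per-chunk processing.
pieces <- split(dat, dat$chunk)
```

Note that with the strict comparison `< 0`, a repeated timestamp stays in the current chunk; if a repeat should also start a new chunk, `<= 0` would be the comparison to use.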
Upvotes: 1