Making an implicit data value explicit in R

Question

I'm working with a somewhat bizarre data format from which I'd like to extract some information. I've done some of this work in Python, but I was curious whether it's crazy to attempt it in R.

The format has three columns:

Column 1 is a timestamp
Column 2 is an observation type (a factor)
Column 3 is a group indicator

The oddness of this format comes from certain logical constraints implied by the row ordering.

Some of the observations are start / stop pairs (no nesting, but other observations can fall inside a start-stop pair)
The timestamps are generally monotonically increasing, except that a fourth value, a second group indicator, is implicitly provided by the ordering: each monotonically increasing chunk of timestamps is its own group.

In case that's not clear, consider the following sequence of timestamps:
```
0 1 2 3 4 5 1 2 3 4 1 2 3 4 5 
      0    |    1  |    2  
```
This is three chunks: every row in the first chunk has a second group indicator of 0, every row in the second chunk has 1, and so on.

Everything becomes a bit easier if I can make that implicit indicator explicit. This is easy enough in Python, but I was hoping to keep it all in R.

A straight loop over the data would be easy enough, but I figure there may be a faster, more R-ish (vectorized) way to do this.

G. Grothendieck · Accepted Answer

Try this:

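```
tt <- c(0, 1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4, 5)
# diff(tt) < 0 is TRUE wherever the timestamp drops, i.e. where a new chunk starts;
# the leading FALSE covers the first element, and cumsum() numbers the chunks from 0
grp <- cumsum(c(FALSE, diff(tt) < 0))
```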

which gives:

```
> grp
[1] 0 0 0 0 0 0 1 1 1 1 2 2 2 2 2
```
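
The same idea carries over to the three-column layout from the question. What follows is only a minimal sketch: the data frame `obs`, its column names (`time`, `type`, `group`, `chunk`) and its values are made up for illustration, and it assumes the rows are still in their original file order, since the chunk boundaries exist only in that ordering.

```r
# Hypothetical data in the question's three-column layout; names and values
# are illustrative, not taken from the real data set.
obs <- data.frame(
  time  = c(0, 1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4, 5),
  type  = factor(c("start", "obs", "obs", "obs", "obs", "stop",
                   "start", "obs", "obs", "stop",
                   "start", "obs", "obs", "obs", "stop")),
  group = "A"
)

# A drop in the timestamp marks the first row of a new chunk;
# cumsum() turns those markers into a running chunk number starting at 0.
obs$chunk <- cumsum(c(FALSE, diff(obs$time) < 0))

# With the indicator explicit, per-chunk work is a plain grouped operation,
# e.g. the time span covered by each chunk.
chunk_span <- tapply(obs$time, obs$chunk, function(x) max(x) - min(x))
```

Note that the comparison is strict, so two equal consecutive timestamps stay in the same chunk; if a repeated timestamp should also start a new chunk, `<= 0` would be the variant to try.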
