Reputation: 449
Working with the data.table package in R, I'm trying to get the 'group number' of some data points.
Specifically, my data is trajectories: I have many rows describing a specific observation of the particle I'm tracking, and I want to generate a specific index for the trajectory based on other identifying information I have.
If I do a [, , by] command, I can group my data by this identifying information and isolate each trajectory.
Is there a symbol, similar to .I or .N, that gives what I would call the index of the subset?
Here's an example with toy data:
library(data.table)
dt <- data.table(x1 = c(rep(1, 4), rep(2, 4)),
                 x2 = c(1, 1, 2, 2, 1, 1, 2, 2),
                 z = runif(8))
I need a fast way to get the trajectory index for each observation (here it should be c(1,1,2,2,3,3,4,4)); my real data set is moderately large.
Upvotes: 3
Views: 88
Reputation: 886968
If we need the trajectories (I'm not sure what that means) based on 'x2' alone, we can use rleid, which numbers consecutive runs of identical values:
dt[, Grp := rleid(x2)]
Or, if we need the group numbers based on both 'x1' and 'x2', .GRP can be used:
dt[, Grp := .GRP, by = .(x1, x2)]
Or this can be done using rleid alone, without the by (as @Frank mentioned):
dt[, Grp := rleid(x1,x2)]
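One caveat worth checking before picking between the two: .GRP counts groups in order of first appearance, while rleid counts consecutive runs, so they should agree only when the rows of each (x1, x2) group sit next to each other, as they do in the toy data. The sketch below illustrates this under stated assumptions: the column names grp_by and grp_rle and the shuffled row order are hypothetical, chosen only for the demonstration.

```r
library(data.table)

# Toy data from the question (z is random, so its values vary per run)
dt <- data.table(x1 = c(rep(1, 4), rep(2, 4)),
                 x2 = c(1, 1, 2, 2, 1, 1, 2, 2),
                 z  = runif(8))

# grp_by / grp_rle are illustrative column names, not from the original post
dt[, grp_by  := .GRP, by = .(x1, x2)]  # group counter, in order of first appearance
dt[, grp_rle := rleid(x1, x2)]         # counter of consecutive runs of (x1, x2)

# Hypothetical reordering so that the rows of a group are no longer contiguous
shuffled <- dt[c(1, 5, 2, 6, 3, 7, 4, 8)]
shuffled[, grp_by  := .GRP, by = .(x1, x2)]
shuffled[, grp_rle := rleid(x1, x2)]

print(dt[, .(x1, x2, grp_by, grp_rle)])
print(shuffled[, .(x1, x2, grp_by, grp_rle)])
```

On the original ordering both columns come out as 1,1,2,2,3,3,4,4. On the shuffled copy grp_by still assigns one number per (x1, x2) pair, whereas grp_rle starts a new number at every change, so if the trajectories are not stored contiguously, the by-based .GRP version (or sorting first, e.g. with setkey) is probably the safer choice.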
Upvotes: 3