Reputation: 435
I want to create a sequential numeric ID for each distinct combination of 3 columns, but the numbering must restart within each group, so that the IDs in every group run from 1 to n.
Using the solution at Creating a unique ID, I can create unique IDs, but they are sequential for the entire data frame.
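# Build the example data; data.frame() keeps k1 and k2 numeric
# (cbind() would coerce every column to character)
k1 <- c(1,1,1,1,1,1,1,1,1,1)
k2 <- c(1,1,1,1,1,2,2,2,2,2)
k3 <- rep(letters[1:2], 5)
df <- data.frame(k1, k2, k3)
# One ID per distinct (k1, k2, k3) combination, numbered across the whole data frame
d <- transform(df, id = as.numeric(interaction(k1, k2, k3, drop = TRUE)))
# Sort so that rows of the same group sit together
d <- d[with(d, order(k1, k2, k3)), ]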
the result is
> d
   k1 k2 k3 id
1   1  1  a  1
3   1  1  a  1
5   1  1  a  1
2   1  1  b  3
4   1  1  b  3
7   1  2  a  2
9   1  2  a  2
6   1  2  b  4
8   1  2  b  4
10  1  2  b  4
and I'd like to have
> d
   k1 k2 k3 id
1   1  1  a  1
3   1  1  a  1
5   1  1  a  1
2   1  1  b  2
4   1  1  b  2
7   1  2  a  1
9   1  2  a  1
6   1  2  b  2
8   1  2  b  2
10  1  2  b  2
Upvotes: 1
Views: 1369
Reputation: 31161
Try using data.table, as mentioned in the link:
library(data.table)
setDT(df)[, id := .GRP, by = list(k1, k3)][]
#     k1 k2 k3 id
#  1:  1  1  a  1
#  2:  1  1  b  2
#  3:  1  1  a  1
#  4:  1  1  b  2
#  5:  1  1  a  1
#  6:  1  2  b  2
#  7:  1  2  a  1
#  8:  1  2  b  2
#  9:  1  2  a  1
# 10:  1  2  b  2
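Note that the call above groups by k1 and k3, so .GRP numbers the (k1, k3) combinations across the whole table. That matches the desired output here because both k2 groups contain the same k3 values. A minimal sketch of a variant that restarts the numbering inside each (k1, k2) group might look like the following; the data (dt, with k3 values "b" and "c" in the second group) is made up for illustration and is not from the question.

library(data.table)

# Hypothetical data: the second (k1, k2) group uses different k3 values
dt <- data.table(
  k1 = rep(1, 10),
  k2 = rep(1:2, each = 5),
  k3 = c("a", "b", "a", "b", "a", "b", "c", "b", "c", "b")
)

# Within each (k1, k2) group, number the distinct k3 values from 1
# (factor() sorts the values present in the group, so IDs follow sorted order)
dt[, id := as.integer(factor(k3)), by = .(k1, k2)]
dt$id
# [1] 1 2 1 2 1 1 2 1 2 1

If the IDs should follow order of first appearance rather than sorted order, match(k3, unique(k3)) could replace as.integer(factor(k3)).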
Upvotes: 3
Reputation: 886948
Try
d$id <- with(d, ave(id, k2, FUN=function(x) as.numeric(factor(x))))
d$id
#[1] 1 1 1 2 2 1 1 2 2 2
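The line above reuses the id column already built with interaction() and groups by k2 only, which is enough here because k1 is constant. As a rough sketch, the same idea can be applied directly to k3 with both grouping columns passed to ave(); the data frame df2 and the helper vector k3_code are hypothetical names with made-up values, not part of the question.

# Hypothetical data: the second (k1, k2) group uses different k3 values
df2 <- data.frame(
  k1 = rep(1, 10),
  k2 = rep(1:2, each = 5),
  k3 = c("a", "b", "a", "b", "a", "b", "c", "b", "c", "b")
)

# ave() returns a vector of the same type as its first argument,
# so encode k3 as integers before grouping
k3_code <- as.integer(factor(df2$k3))

# Re-number the codes from 1 within each (k1, k2) group
df2$id <- ave(k3_code, df2$k1, df2$k2,
              FUN = function(x) as.integer(factor(x)))
df2$id
# [1] 1 2 1 2 1 1 2 1 2 1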
Upvotes: 2