Reputation: 111
This is a continuation of r - How to add row index to a data frame, based on combination of factors
I tried to replicate what I believe to be the desired results using the green checked answer and am consistently getting something other than expected. I am sure I am doing something really basic wrong, but can't seem to see it OR I've misunderstood what the desired state is.
The data from the original post:
temp <- data.frame(
Dim1 = c("A","A","A","A","A","A","B","B"),
Dim2 = c(100,100,100,100,200,200,100,200),
Value = sample(1:10, 8)
)
Then I ran the following code: temp$indexLength <- ave( 1:nrow(temp), temp$Dim1, factor( temp$Dim2), FUN=function(x) 1:length(x) )
and: temp$indexSeqAlong <- ave( 1:nrow(temp), temp$Dim1, factor( temp$Dim2), FUN=seq_along )
and then I created the following: temp$indexDesired <- c(1, 1, 1, 1, 2, 2, 3, 3)
...ending up with the data frame below:
Dim1 Dim2 Value indexLength indexSeqAlong indexDesired
1 A 100 6 1 1 1
2 A 100 2 2 2 1
3 A 100 9 3 3 1
4 A 100 8 4 4 1
5 A 200 10 1 1 2
6 A 200 4 2 2 2
7 B 100 3 1 1 3
8 B 200 5 1 1 4
If I can figure out what I'm not getting the desired index -- and assuming the code is extensible to more than 2 variables -- I should be all set. Thanks in advance!
Upvotes: 2
Views: 6458
Reputation: 115392
If you use data.table
, there is a "symbol" .GRP
which records this information ( a simple group counter)
library(data.table)
DT <- data.table(temp)
DT[, index := .GRP, by = list(Dim1, Dim2)]
DT
# Dim1 Dim2 Value index
# 1: A 100 10 1
# 2: A 100 2 1
# 3: A 100 9 1
# 4: A 100 4 1
# 5: A 200 6 2
# 6: A 200 1 2
# 7: B 100 8 3
# 8: B 200 7 4
Upvotes: 3
Reputation: 263342
Once the values in teh first argument have been partitioned, there is no way that ave "knows" what order they have been passed. You want a method that can look at changes in values. The duplicated
function is generic and has a data.frame method that looks at multiple columns:
temp$indexSeqAlong <- cumsum(!duplicated(temp[, 1:2]) )
temp
Dim1 Dim2 Value indexSeqAlong
1 A 100 8 1
2 A 100 2 1
3 A 100 7 1
4 A 100 3 1
5 A 200 5 2
6 A 200 1 2
7 B 100 4 3
8 B 200 10 4
Is extensible to as many columns as you want.
Upvotes: 1