Reputation: 81
How to replicate rows in an R data frame based on a condition is explained here:
Replicate each row of data.frame and specify the number of replications for each row
But I also want to index the new duplicates in a new column, i.e. the first duplicate would be indexed 1, the second 2, the third 3, and so on.
Consider the data frame:
df <- data.frame(var1=c('a', 'b', 'c'), var2=c('d', 'e', 'f'), freq=1:3)
> df
  var1 var2 freq
1    a    d    1
2    b    e    2
3    c    f    3
I can duplicate each row based on the column freq this way:
df.expanded <- df[rep(row.names(df), df$freq),]
> df.expanded
    var1 var2 freq
1      a    d    1
2      b    e    2
2.1    b    e    2
3      c    f    3
3.1    c    f    3
3.2    c    f    3
What I want is to also have an index that differentiates these new duplicates, like this:
> df.expanded
  var1 var2 freq ind
1    a    d    1   1
2    b    e    2   1
3    b    e    2   2
4    c    f    3   1
5    c    f    3   2
6    c    f    3   3
Thanks.
Upvotes: 0
Views: 739
Reputation: 132969
df.expanded$ind <- sequence(df$freq)
#    var1 var2 freq ind
#1      a    d    1   1
#2      b    e    2   1
#2.1    b    e    2   2
#3      c    f    3   1
#3.1    c    f    3   2
#3.2    c    f    3   3
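For context, sequence() takes a vector of lengths and concatenates 1:n for each one, so sequence(c(1, 2, 3)) gives 1 1 2 1 2 3, which lines up with the rows produced by rep(). Below is a minimal end-to-end sketch using the data from the question. Indexing by seq_len(nrow(df)) instead of row.names(df) and resetting the row names at the end are my own assumptions about what is wanted, not part of the one-liner above.

```r
# Example data from the question
df <- data.frame(var1 = c("a", "b", "c"),
                 var2 = c("d", "e", "f"),
                 freq = 1:3)

# Repeat row i exactly freq[i] times
df.expanded <- df[rep(seq_len(nrow(df)), df$freq), ]

# sequence() concatenates 1:1, 1:2, 1:3, one run per original row
df.expanded$ind <- sequence(df$freq)

# Optional cleanup (assumption): replace row names like "2.1" with 1..n
rownames(df.expanded) <- NULL

print(df.expanded)
```

This relies on the rows being expanded in the same order as df$freq. If the expanded data frame is reordered before the index is added, the counter would no longer match its group.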
Upvotes: 1