Reputation: 81

Replicate rows in an R data frame and index new duplicates

How to replicate rows in an R data frame based on a condition is explained here:

Replicate each row of data.frame and specify the number of replications for each row

But I also want to index the new duplicates in a new column. That is, the first duplicate would be indexed 1, the second 2, the third 3, and so on.

Consider the data frame:

df <- data.frame(var1=c('a', 'b', 'c'), var2=c('d', 'e', 'f'), freq=1:3)

>df
  var1 var2 freq
1    a    d    1
2    b    e    2
3    c    f    3

I can duplicate each row based on the column freq this way:

df.expanded <- df[rep(row.names(df), df$freq),]

>df.expanded
    var1 var2 freq
1      a    d    1
2      b    e    2
2.1    b    e    2
3      c    f    3
3.1    c    f    3
3.2    c    f    3

What I also want is an index column that distinguishes these new duplicates, like this:

>df.expanded
  var1 var2 freq  ind
1    a    d    1    1
2    b    e    2    1
3    b    e    2    2
4    c    f    3    1
5    c    f    3    2
6    c    f    3    3

Thanks.

Upvotes: 0

Answers (2)

Roland

Reputation: 132969

df.expanded$ind <- sequence(df$freq)
#    var1 var2 freq ind
#1      a    d    1   1
#2      b    e    2   1
#2.1    b    e    2   2
#3      c    f    3   1
#3.1    c    f    3   2
#3.2    c    f    3   3
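
For reference, here is a minimal self-contained sketch of this approach, assuming the `df` from the question. `sequence()` concatenates `seq_len(n)` for each `n` it is given, so the counter restarts for every original row. Resetting the row names at the end is an optional extra, not something the question asked for.

```r
# Data frame from the question
df <- data.frame(var1 = c('a', 'b', 'c'), var2 = c('d', 'e', 'f'), freq = 1:3)

# Repeat each row freq times (indexing by row position)
df.expanded <- df[rep(seq_len(nrow(df)), df$freq), ]

# sequence(1:3) is c(1, 1, 2, 1, 2, 3): a counter that restarts per original row
df.expanded$ind <- sequence(df$freq)

# Optional: replace the row names (2.1, 3.1, ...) with 1..n
rownames(df.expanded) <- NULL
```

This only lines up because `df.expanded` was built from `df$freq` in the same row order; if the expanded data frame is reordered first, the index would need to be computed per group instead.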

Upvotes: 1

Robert Krzyzanowski

Reputation: 9344

# lapply always returns a list; sapply would simplify to a matrix when all freq are equal
df.expanded$ind <- unlist(lapply(df$freq, seq_len))
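
One caveat worth checking with apply-style approaches: `sapply` simplifies its result to a matrix when every element has the same length, and `unlist` leaves a matrix unchanged, so assigning it as a column can fail. A small sketch comparing it with `lapply`, using hypothetical `freq` values that are not from the question:

```r
# Hypothetical counts where every row is repeated the same number of times
freq <- c(2, 2, 2)

# sapply simplifies the equal-length results into a 2 x 3 matrix
m <- sapply(freq, seq_len)
is.matrix(m)

# lapply always returns a list, so unlist() gives a plain vector of length 6
ind <- unlist(lapply(freq, seq_len))
length(ind)
```

With the `freq` column from the question (1, 2, 3) the lengths differ, so both variants give the same vector.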

Upvotes: 2
