Reputation: 133
I have the following data frame and I want to repeat it N times:
dc <- read.table(text = "from 1 2 3 4 5
1 0.01 0.02 0.03 0.04 0.05
2 0.06 0.07 0.08 0.09 0.10
3 0.11 0.12 0.13 0.14 0.15
4 0.16 0.17 0.18 0.19 0.20
5 0.21 0.22 0.23 0.24 0.25", header = TRUE)
n<-20
ddr <- NA
for(i in 1:n) {
ddr <- rbind(ddr, cbind(dc,i))
}
As a result, I would like to receive:
from X1 X2 X3 X4 X5 i
1 0.01 0.02 0.03 0.04 0.05 1
2 0.06 0.07 0.08 0.09 0.10 1
3 0.11 0.12 0.13 0.14 0.15 1
4 0.16 0.17 0.18 0.19 0.20 1
5 0.21 0.22 0.23 0.24 0.25 1
1 0.01 0.02 0.03 0.04 0.05 2
2 0.06 0.07 0.08 0.09 0.10 2
3 0.11 0.12 0.13 0.14 0.15 2
4 0.16 0.17 0.18 0.19 0.20 2
5 0.21 0.22 0.23 0.24 0.25 2
.............................
1 0.01 0.02 0.03 0.04 0.05 20
2 0.06 0.07 0.08 0.09 0.10 20
3 0.11 0.12 0.13 0.14 0.15 20
4 0.16 0.17 0.18 0.19 0.20 20
5 0.21 0.22 0.23 0.24 0.25 20
The data frame must be repeated N times, with the repeat number added as an extra column.
Is there a proper solution (an easy function in R) for this? In my case, if ddr is not initialized first (ddr <- NA), the script does not work. Thanks!
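For reference, here is a minimal sketch of two ways the loop could be written without the NA starting value. It assumes the same dc and n as above; the names pieces and ddr2 are made up for illustration. The first option relies on rbind() skipping NULL arguments, and the second builds all copies in a list and binds them once.

```r
dc <- read.table(text = "from 1 2 3 4 5
1 0.01 0.02 0.03 0.04 0.05
2 0.06 0.07 0.08 0.09 0.10
3 0.11 0.12 0.13 0.14 0.15
4 0.16 0.17 0.18 0.19 0.20
5 0.21 0.22 0.23 0.24 0.25", header = TRUE)
n <- 20

# Option 1: the same loop, but starting from NULL. rbind() skips NULL,
# so no placeholder row of NAs ends up at the top of the result.
ddr <- NULL
for (i in seq_len(n)) {
  ddr <- rbind(ddr, cbind(dc, i = i))
}

# Option 2: build the n copies in a list (hypothetical name: pieces)
# and bind them once at the end instead of growing ddr step by step.
pieces <- lapply(seq_len(n), function(i) cbind(dc, i = i))
ddr2 <- do.call(rbind, pieces)

dim(ddr)
dim(ddr2)
```

Both versions should give one row per original row per repeat, with the repeat number in column i.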
Upvotes: 4
Views: 7392
Reputation: 4224
Here is a more intuitive way, about as fast as the other top answer:
n <- 3
data.frame(dc, i = rep(1:n, each = NROW(dc)))
Output (repeated 3x):
from X1 X2 X3 X4 X5 i
1 1 0.01 0.02 0.03 0.04 0.05 1
2 2 0.06 0.07 0.08 0.09 0.10 1
3 3 0.11 0.12 0.13 0.14 0.15 1
4 4 0.16 0.17 0.18 0.19 0.20 1
5 5 0.21 0.22 0.23 0.24 0.25 1
6 1 0.01 0.02 0.03 0.04 0.05 2
7 2 0.06 0.07 0.08 0.09 0.10 2
8 3 0.11 0.12 0.13 0.14 0.15 2
9 4 0.16 0.17 0.18 0.19 0.20 2
10 5 0.21 0.22 0.23 0.24 0.25 2
11 1 0.01 0.02 0.03 0.04 0.05 3
12 2 0.06 0.07 0.08 0.09 0.10 3
13 3 0.11 0.12 0.13 0.14 0.15 3
14 4 0.16 0.17 0.18 0.19 0.20 3
15 5 0.21 0.22 0.23 0.24 0.25 3
EDIT: Top Answer Speed Test
This test was scaled up to n = 1e+05, with 100 iterations:
n <- 1e5
func1 <- function() {
  data.frame(dc, i = rep(1:n, each = NROW(dc)))
}
func2 <- function() {
  cbind(dc, i = rep(1:n, each = nrow(dc)))
}
func3 <- function() {
  cbind(dc[rep(1:nrow(dc), n), ], i = rep(1:n, each = nrow(dc)))
}
microbenchmark::microbenchmark(
  func1(), func2(), func3(), times = 100)
Unit: milliseconds
expr min lq mean median uq max neval cld
func1() 15.58709 21.69143 28.62695 22.01692 23.85648 117.9012 100 a
func2() 15.99023 21.59375 28.37328 22.18298 23.99953 136.1209 100 a
func3() 414.18741 436.51732 473.14571 453.26099 498.21576 666.8515 100 b
Upvotes: 4
Reputation: 99321
You can use rep() to replicate the row indexes, and also to create the repeat number column.
cbind(dc[rep(1:nrow(dc), n), ], i = rep(1:n, each = nrow(dc)))
Let's break it down:
- dc[rep(1:nrow(dc), n), ] uses replicated row indexes in the i value of row indexing of [ for data frames
- rep(1:n, each = nrow(dc)) replicates a sequence the length of the n value, nrow(dc) times each
- cbind(...) combines the two into a single data frame
As @HubertL points out in the comments, this can be further simplified to
cbind(dc, i = rep(1:n, each = nrow(dc)))
thanks to the magic of recycling. Please go give him a vote.
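To see the recycling at work, here is a small sketch that builds the result both ways and compares them. It assumes the dc from the question and a small n chosen for illustration; the names explicit and recycled are made up for this example.

```r
# Rebuild the example data from the question.
dc <- read.table(text = "from 1 2 3 4 5
1 0.01 0.02 0.03 0.04 0.05
2 0.06 0.07 0.08 0.09 0.10
3 0.11 0.12 0.13 0.14 0.15
4 0.16 0.17 0.18 0.19 0.20
5 0.21 0.22 0.23 0.24 0.25", header = TRUE)
n <- 3  # small value chosen for illustration

# Explicit version: replicate the row indexes, then add the counter.
explicit <- cbind(dc[rep(seq_len(nrow(dc)), n), ],
                  i = rep(seq_len(n), each = nrow(dc)))

# Shorter version: the rows of dc are recycled to match the length of i.
recycled <- cbind(dc, i = rep(seq_len(n), each = nrow(dc)))

# The explicit version carries row names derived from the replicated
# indexes, so reset the row names on both before comparing.
rownames(explicit) <- NULL
rownames(recycled) <- NULL
all.equal(explicit, recycled)
```

Note that recycling only applies when the length of i is a whole multiple of nrow(dc), which is always the case here because i is built with each = nrow(dc).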
Upvotes: 6