Max

Reputation: 109

Repeating elements in a vector with a for loop

I want to make a vector from 3:50 in R, looking like

3 4 4 5 6 6 7 8 8 .. 50 50

I want to use a for loop inside a for loop, but it's not doing what I want.

f <- c()
for (i in 3:50) {
  for(j in 1:2) {
    f = c(f, i)
  }
}

What is wrong with it?

Upvotes: 9

Views: 2625

Answers (8)

www

Reputation: 39154

A solution based on sapply.

as.vector(sapply(0:23 * 2 + 2, function(x)  x + c(1, 2, 2)))

# [1]  3  4  4  5  6  6  7  8  8  9 10 10 11 12 12 13 14 14 15 16 16 17 18 18 19 20 20 21 22 22 23 24 24 25 26 26
# [37] 27 28 28 29 30 30 31 32 32 33 34 34 35 36 36 37 38 38 39 40 40 41 42 42 43 44 44 45 46 46 47 48 48 49 50 50

Benchmarking

Here is a performance comparison of all the current answers. The results show that cumsum(rep(c(1, 1, 0), 24)) + 2L (m8) is the fastest, with rep(3:50, rep(1:2, 24)) (m1) almost as fast.

library(microbenchmark)
library(ggplot2)

perf <- microbenchmark(
  m1 = {rep(3:50, rep(1:2, 24))},
  m2 = {rep(3:50, each = 2)[c(TRUE, FALSE, TRUE, TRUE)]},
  m3 = {v <- 3:50; sort(c(v,v[v %% 2 == 0]))},
  m4 = {as.vector(t(cbind(seq(3,49,2),seq(4,50,2),seq(4,50,2))))},
  m5 = {as.vector(sapply(0:23 * 2 + 2, function(x)  x + c(1, 2, 2)))},
  m6 = {sort(c(3:50, seq(4, 50, 2)))},
  m7 = {rep(seq(3, 50, 2), each=3) + c(0, 1, 1)},
  m8 = {cumsum(rep(c(1, 1, 0), 24)) + 2L},
  times = 10000L
)

perf
# Unit: nanoseconds
# expr   min    lq      mean median    uq     max neval
#   m1   514  1028  1344.980   1029  1542  190200 10000
#   m2  1542  2570  3083.716   3084  3085  191229 10000
#   m3 26217 30329 35593.596  31871 34442 5843267 10000
#   m4 43180 48321 56988.386  50891 55518 6626173 10000
#   m5 30843 35984 42077.543  37526 40611 6557289 10000
#   m6 40611 44209 50092.131  46779 50891  446714 10000
#   m7 13879 16449 19314.547  17478 19020 6309001 10000
#   m8     0  1028  1256.715   1028  1542   71454 10000
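Before comparing timings, it may be worth confirming that all eight expressions produce the same values. The following is a minimal sketch of such a check; the names `expected`, `candidates`, and `same` are illustrative and not part of the original benchmark.

```r
# Reference result: each odd number once, each even number twice.
expected <- rep(3:50, rep(1:2, 24))

# The eight benchmarked expressions, evaluated once each.
candidates <- list(
  m1 = rep(3:50, rep(1:2, 24)),
  m2 = rep(3:50, each = 2)[c(TRUE, FALSE, TRUE, TRUE)],
  m3 = { v <- 3:50; sort(c(v, v[v %% 2 == 0])) },
  m4 = as.vector(t(cbind(seq(3, 49, 2), seq(4, 50, 2), seq(4, 50, 2)))),
  m5 = as.vector(sapply(0:23 * 2 + 2, function(x) x + c(1, 2, 2))),
  m6 = sort(c(3:50, seq(4, 50, 2))),
  m7 = rep(seq(3, 50, 2), each = 3) + c(0, 1, 1),
  m8 = cumsum(rep(c(1, 1, 0), 24)) + 2L
)

# Some expressions return integers and others doubles, so compare the
# values after coercing to numeric instead of calling identical() directly.
same <- vapply(
  candidates,
  function(x) isTRUE(all.equal(as.numeric(x), as.numeric(expected))),
  logical(1)
)
print(same)
```

If any entry of `same` were FALSE, the corresponding timing would not be comparable with the others.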

Upvotes: 9

moodymudskipper

Reputation: 47320

Another idea, though not competing in speed with fastest solutions:

mat <- matrix(3:50,nrow=2)
c(rbind(mat,mat[2,]))
# [1]  3  4  4  5  6  6  7  8  8  9 10 10 11 12 12 13 14 14 15 16 16 17 18 18 19 20 20 21 22 22
# [31] 23 24 24 25 26 26 27 28 28 29 30 30 31 32 32 33 34 34 35 36 36 37 38 38 39 40 40 41 42 42
# [61] 43 44 44 45 46 46 47 48 48 49 50 50

Upvotes: 0

lmo

Reputation: 38510

Here is a method that combines portions of a couple of the other answers.

rep(seq(3, 50, 2), each=3) + c(0, 1, 1)
 [1]  3  4  4  5  6  6  7  8  8  9 10 10 11 12 12 13 14 14 15 16
[21] 16 17 18 18 19 20 20 21 22 22 23 24 24 25 26 26 27 28 28 29
[41] 30 30 31 32 32 33 34 34 35 36 36 37 38 38 39 40 40 41 42 42
[61] 43 44 44 45 46 46 47 48 48 49 50 50

Here is a second method using cumsum:

cumsum(rep(c(1, 1, 0), 24)) + 2L

This should be very quick.
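To see why the cumsum version works, it can help to look at the intermediate values on a shorter range. This is only an illustrative sketch: it uses 4 repetitions instead of 24, and the names `steps`, `running`, and `result` are made up for the example.

```r
# Step pattern: go up by 1, go up by 1, then stay put.
# The 0 step is what repeats each even value.
steps <- rep(c(1, 1, 0), 4)

# Running total of the steps: 1 2 2 3 4 4 5 6 6 7 8 8
running <- cumsum(steps)

# Shift by 2 so that the sequence starts at 3.
result <- running + 2L
print(result)
# [1]  3  4  4  5  6  6  7  8  8  9 10 10
```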

Upvotes: 4

kangaroo_cliff

Reputation: 6222

This should do too.

sort(c(3:50, seq(4, 50, 2)))

Upvotes: 3

Jaap

Reputation: 83245

Another option is to use an embedded rep:

rep(3:50, rep(1:2, 24))

which gives:

 [1]  3  4  4  5  6  6  7  8  8  9 10 10 11 12 12 13 14 14 15 16 16 17 18 18 19 20 20
[28] 21 22 22 23 24 24 25 26 26 27 28 28 29 30 30 31 32 32 33 34 34 35 36 36 37 38 38
[55] 39 40 40 41 42 42 43 44 44 45 46 46 47 48 48 49 50 50

This relies on the fact that the times argument of rep can also be an integer vector of the same length as the x argument, in which case each element of x is repeated the corresponding number of times.

You can generalize this to:

s <- 3
e <- 50
v <- 1:2

rep(s:e, rep(v, (e-s+1)/2))

Yet another option, using a mix of rep and rep_len:

v <- 3:50
rep(v, rep_len(1:2, length(v)))
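The per-element behaviour of the times argument described above is easiest to see on a tiny input. A minimal sketch, with a made-up character vector `x` and result `out` chosen only for illustration:

```r
x <- c("a", "b", "c", "d")

# times has the same length as x, so element i of x is repeated times[i] times.
out <- rep(x, times = c(1, 2, 1, 2))
print(out)
# [1] "a" "b" "b" "c" "d" "d"

# rep_len() recycles 1:2 to the required length, as in the answer above.
counts <- rep_len(1:2, length(x))   # 1 2 1 2
```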

Upvotes: 16

John Coleman

Reputation: 51998

Here is a loop-free 1 line solution:

> as.vector(t(cbind(seq(3,49,2),seq(4,50,2),seq(4,50,2))))
 [1]  3  4  4  5  6  6  7  8  8  9 10 10 11 12 12 13 14 14 15 16 16 17
[23] 18 18 19 20 20 21 22 22 23 24 24 25 26 26 27 28 28 29 30 30 31 32
[45] 32 33 34 34 35 36 36 37 38 38 39 40 40 41 42 42 43 44 44 45 46 46
[67] 47 48 48 49 50 50

It forms a matrix whose first column is the odd numbers in the range 3:50 and whose second and third columns are the even numbers in that range and then (by taking the transpose) reads it off row by row.
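The same construction on a shorter range (3 to 8) makes the layout visible. This is a small illustrative sketch; the names `odds`, `evens`, `m`, and `flat` are not from the original answer.

```r
odds  <- seq(3, 7, 2)   # 3 5 7
evens <- seq(4, 8, 2)   # 4 6 8

# One row per group: odd, even, even.
m <- cbind(odds, evens, evens)

# as.vector() reads a matrix column by column, so transpose first
# in order to read the original matrix row by row.
flat <- as.vector(t(m))
print(flat)
# [1] 3 4 4 5 6 6 7 8 8
```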

The problem with your nested loop approach is that the fundamental pattern is one of length 3, repeated 24 times (instead of a pattern of length 2 repeated 48 times, which is what your loops produce). If you wanted to use a nested loop, the outer loop could iterate 24 times and the inner loop 3 times. The first pass through the outer loop could construct 3,4,4. The second pass could construct 5,6,6. Etc. Since there are 24*3 = 72 elements, you can pre-allocate the vector (by using f <- vector("numeric",72) ) so that you aren't growing it 1 element at a time. The idiom f <- c(f,i) that you are using at each stage copies all of the old elements just to create a new vector which is only 1 element longer. Here there are too few elements for it to really make a difference, but if you try to create large vectors that way the performance can be shockingly bad.
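One way the nested loop described above might look is sketched below. The variable names (`n_groups`, `odd`, `pattern`) and the index arithmetic are one possible choice, not the only way to write it.

```r
n_groups <- 24                         # 24 groups of the form (odd, even, even)
f <- vector("numeric", n_groups * 3)   # pre-allocate all 72 slots

for (g in seq_len(n_groups)) {
  odd <- 2 * g + 1                     # 3, 5, ..., 49
  pattern <- c(odd, odd + 1, odd + 1)  # e.g. 3, 4, 4 for the first group
  for (k in 1:3) {
    # Assign into the pre-allocated slot instead of growing f with c().
    f[3 * (g - 1) + k] <- pattern[k]
  }
}

print(head(f, 9))
# [1] 3 4 4 5 6 6 7 8 8
```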

Upvotes: 4

MKR

Reputation: 20095

The easiest way I can find is to create a second vector containing only the even values (based on the OP's intention) and then simply join the two vectors and sort the result. For example:

v <- 3:50
sort(c(v,v[v %% 2 == 0]))

#  [1]  3  4  4  5  6  6  7  8  8  9 10 10 11 12 12 13 14 14 15 16 16
# [22] 17 18 18 19 20 20 21 22 22 23 24 24 25 26 26 27 28 28
# [40] 29 30 30 31 32 32 33 34 34 35 36 36 37 38 38 39 40 40 41 42 42
# [61] 43 44 44 45 46 46 47 48 48 49 50 50

Upvotes: 5

akraf

Reputation: 3235

Use the rep function together with a recycled logical index, ...[c(TRUE, FALSE, TRUE, TRUE)]:

rep(3:50, each = 2)[c(TRUE, FALSE, TRUE, TRUE)]

 ## [1]  3  4  4  5  6  6  7  8  8  9 10 10 11 12 12 13 14 14 15 16 16 17 18 18 19
## [26] 20 20 21 22 22 23 24 24 25 26 26 27 28 28 29 30 30 31 32 32 33 34 34 35 36
## [51] 36 37 38 38 39 40 40 41 42 42 43 44 44 45 46 46 47 48 48 49 50 50

If you use a logical vector (TRUE/FALSE) as an index (inside [ ]), a TRUE selects the corresponding element and a FALSE omits it. If the logical index vector (c(TRUE, FALSE, TRUE, TRUE)) is shorter than the indexed vector (rep(3:50, each = 2) in your case), the index vector is recycled.
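The recycling rule can be checked on a short vector. A minimal sketch with made-up values (`x`, `mask`, and `kept` are illustrative names):

```r
x <- c(10, 10, 20, 20, 30, 30, 40, 40)
mask <- c(TRUE, FALSE, TRUE, TRUE)

# mask has length 4 and x has length 8, so mask is recycled twice:
# keep, drop, keep, keep, keep, drop, keep, keep
kept <- x[mask]
print(kept)
# [1] 10 20 20 30 40 40
```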

Also a side note: Whenever you use R code like

 x = c(x, something)

or

 x = rbind(x, something)

or similar, you are adopting a C-like programming style in R. This makes your code unnecessarily complex and might lead to low performance and out-of-memory issues if you work with large (say, 200MB+) data sets. R is designed to spare you this kind of low-level tinkering with data structures.

For more information about the gluttons and their punishment, read the R Inferno, Circle 2: Growing Objects.
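As a rough illustration of the alternatives to growing an object, the sketch below builds the same result in three ways and then binds rows once at the end instead of calling rbind repeatedly. The size `n` and the squaring task are arbitrary choices made for the example.

```r
n <- 1000

# Growing pattern: c() builds a new, longer vector on every iteration.
grown <- c()
for (i in seq_len(n)) grown <- c(grown, i^2)

# Pre-allocated: create the full-length vector once, then fill its slots.
filled <- numeric(n)
for (i in seq_len(n)) filled[i] <- i^2

# Vectorised: no explicit loop at all.
vectorised <- seq_len(n)^2

# For rows, collect the pieces in a list and bind them in a single call
# rather than using x = rbind(x, something) inside a loop.
rows <- lapply(1:3, function(i) c(id = i, square = i^2))
m <- do.call(rbind, rows)
print(m)
```

All three vectors hold the same values; the difference only becomes noticeable in run time and memory use as `n` grows.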

Upvotes: 8
