user2805568

Reputation: 297

Insert elements in a vector in R

I have a vector in R,

a = c(2,3,4,9,10,2,4,19)

Let us say I want to efficiently insert the following vectors, b and d,

b = c(2,1)
d = c(0,1)

right after the 3rd and 7th positions (the "4" entries), resulting in,

e = c(2,3,4,2,1,9,10,2,4,0,1,19)

How would I do this efficiently in R, without recursively using cbind or the like?

I found the package R.basic, but it's not on CRAN, so I would rather use a supported alternative.

Upvotes: 23

Views: 43067

Answers (6)

Itamar

Reputation: 2141

The straightforward approach:

b.pos <- 3
d.pos <- 7
c(a[1:b.pos], b, a[(b.pos+1):d.pos], d, a[(d.pos+1):length(a)])
# [1]  2  3  4  2  1  9 10  2  4  0  1 19

Note the importance of the parentheses around the boundaries of the : operator.
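To illustrate the precedence issue: in R, the : operator binds more tightly than +, so dropping the parentheses changes which indices are selected. A small sketch using the same b.pos and d.pos (the names without_parens and with_parens are only for illustration):

```r
b.pos <- 3
d.pos <- 7

# Parsed as b.pos + (1:d.pos), i.e. 3 + 1:7
without_parens <- b.pos + 1:d.pos
without_parens
# [1]  4  5  6  7  8  9 10

# The intended range of indices between the two insertion points
with_parens <- (b.pos + 1):d.pos
with_parens
# [1] 4 5 6 7
```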

Upvotes: 5

Tutur Qhuhuit

Reputation: 51

After using Ferdinand's function, I tried writing my own, and surprisingly it is far more efficient.
Here's mine:

insertElems = function(vect, pos, elems) {
  l = length(vect)  # length of the original vector
  j = 0             # number of elements inserted so far
  for (i in seq_along(pos)) {
    if (pos[i] == 1)
      vect = c(elems[j+1], vect)
    else if (pos[i] == l + 1)  # compare with the original length, since vect has already grown
      vect = c(vect, elems[j+1])
    else
      vect = c(vect[1:(pos[i]-1+j)], elems[j+1], vect[(pos[i]+j):(l+j)])
    j = j+1
  }
  return(vect)
}

tmp = 1:5
insertElems(tmp, c(2,4,5), c(NA,NA,NA))
# [1]  1 NA  2  3 NA  4 NA  5

insert.at(tmp, c(2,4,5), c(NA,NA,NA))
# [1]  1 NA  2  3 NA  4 NA  5

And here are the benchmark results:

> microbenchmark(insertElems(tmp, c(2,4,5), c(NA,NA,NA)), insert.at(tmp, c(2,4,5), c(NA,NA,NA)), times = 10000)
Unit: microseconds
                                        expr    min     lq     mean median     uq      max neval
 insertElems(tmp, c(2, 4, 5), c(NA, NA, NA))  9.660 11.472 13.44247  12.68 13.585 1630.421 10000
   insert.at(tmp, c(2, 4, 5), c(NA, NA, NA)) 58.866 62.791 70.36281  64.30 67.923 2475.366 10000

My code also handles some cases that insert.at does not:

> insert.at(tmp, c(1,4,5), c(NA,NA,NA))
# [1]  1  2  3 NA  4 NA  5 NA  1  2  3
# Warning message:
# In result[c(TRUE, FALSE)] <- split(a, cumsum(seq_along(a) %in% (pos))) :
#   number of items to replace is not a multiple of replacement length

> insertElems(tmp, c(1,4,5), c(NA,NA,NA))
# [1] NA  1  2  3 NA  4 NA  5

Upvotes: 3

A5C1D2H2I1M1N2O1R2T1

Reputation: 193497

Here's an alternative that uses append. It's fine for small vectors, but I can't imagine it being efficient for large vectors since a new vector is created upon each iteration of the loop (which is, obviously, bad). The trick is to reverse the vector of things that need to be inserted to get append to insert them in the correct place relative to the original vector.

a = c(2,3,4,9,10,2,4,19)
b = c(2,1)
d = c(0,1)

pos <- c(3, 7)
z <- setNames(list(b, d), pos)
# Sort by numeric position; sorting the character names would put "10" before "3"
z <- z[order(as.numeric(names(z)), decreasing = TRUE)]

for (i in seq_along(z)) {
  a <- append(a, z[[i]], after = as.numeric(names(z)[[i]]))
}

a
#  [1]  2  3  4  2  1  9 10  2  4  0  1 19
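To see why the insertion order matters, here is a small sketch comparing both orders with nested append calls. The names ascending and descending are only for illustration:

```r
a <- c(2, 3, 4, 9, 10, 2, 4, 19)
b <- c(2, 1)
d <- c(0, 1)

# Ascending order: inserting b first shifts everything after position 3,
# so "after = 7" no longer refers to the original 7th element.
ascending <- append(append(a, b, after = 3), d, after = 7)
ascending
# [1]  2  3  4  2  1  9 10  0  1  2  4 19

# Descending order: inserting d first leaves positions 1 to 7 untouched,
# so "after = 3" is still correct for b.
descending <- append(append(a, d, after = 7), b, after = 3)
descending
# [1]  2  3  4  2  1  9 10  2  4  0  1 19
```

Only the descending order reproduces the vector e from the question.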

Upvotes: 2

Ferdinand.kraft

Reputation: 12819

Try this:

result <- vector("list",5)
result[c(TRUE,FALSE)] <- split(a, cumsum(seq_along(a) %in% (c(3,7)+1)))
result[c(FALSE,TRUE)] <- list(b,d)
f <- unlist(result)

identical(f, e)
#[1] TRUE
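For readers unfamiliar with the split/cumsum idiom, the intermediate values can be inspected step by step. This is just the one-liner above unrolled; the names starts, chunk_id and chunks are introduced here for illustration:

```r
a <- c(2, 3, 4, 9, 10, 2, 4, 19)

# TRUE at the first element of each new chunk (positions 4 and 8)
starts <- seq_along(a) %in% (c(3, 7) + 1)
starts
# [1] FALSE FALSE FALSE  TRUE FALSE FALSE FALSE  TRUE

# The running total gives every element a chunk id
chunk_id <- cumsum(starts)
chunk_id
# [1] 0 0 0 1 1 1 1 2

# split() groups the elements of a by chunk id:
# chunk "0" is 2 3 4, chunk "1" is 9 10 2 4, chunk "2" is 19
chunks <- split(a, chunk_id)
```

The three chunks then go into the odd slots of result, and b and d into the even slots.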

EDIT: generalization to arbitrary number of insertions is straightforward:

insert.at <- function(a, pos, ...){
    dots <- list(...)
    stopifnot(length(dots)==length(pos))
    result <- vector("list",2*length(pos)+1)
    result[c(TRUE,FALSE)] <- split(a, cumsum(seq_along(a) %in% (pos+1)))
    result[c(FALSE,TRUE)] <- dots
    unlist(result)
}


> insert.at(a, c(3,7), b, d)
 [1]  2  3  4  2  1  9 10  2  4  0  1 19

> insert.at(1:10, c(4,7,9), 11, 12, 13)
 [1]  1  2  3  4 11  5  6  7 12  8  9 13 10

> insert.at(1:10, c(4,7,9), 11, 12)
Error: length(dots) == length(pos) is not TRUE

Note the bonus error checking if the number of positions and insertions do not match.
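One limitation: when pos contains 0 (insert before the first element) or length(a) (insert after the last), split returns fewer chunks than insert.at expects, and the result is wrong. A possible variant, not part of the original answer, fixes the chunk ids with a factor so that empty chunks are kept. The name insert_at_safe and the extra checks are assumptions made for this sketch:

```r
insert_at_safe <- function(a, pos, ...) {
  dots <- list(...)
  stopifnot(length(dots) == length(pos),
            !is.unsorted(pos, strictly = TRUE),   # positions must be increasing
            all(pos >= 0), all(pos <= length(a)))
  # A factor with all levels 0..length(pos) makes split() return empty chunks too
  grp <- factor(cumsum(seq_along(a) %in% (pos + 1)), levels = 0:length(pos))
  result <- vector("list", 2 * length(pos) + 1)
  result[c(TRUE, FALSE)] <- split(a, grp)
  result[c(FALSE, TRUE)] <- dots
  unlist(result, use.names = FALSE)
}

insert_at_safe(1:5, c(0, 5), 99, 100)
# [1]  99   1   2   3   4   5 100
```

For positions strictly inside the vector it should behave like insert.at, apart from returning an unnamed vector.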

Upvotes: 17

Frank

Reputation: 66819

Here's another function, using Ricardo's syntax, Ferdinand's split and @Arun's interleaving trick from another question:

ins2 <- function(a,bs,pos){
    # seq_along(a) rather than seq(a), which misbehaves when a has length 1
    as <- split(a,cumsum(seq_along(a)%in%(pos+1)))
    idx <- order(c(seq_along(as),seq_along(bs)))
    unlist(c(as,bs)[idx])
}

The advantage is that this should extend to more insertions. However, it may produce unexpected output when passed invalid arguments, e.g., when any(pos > length(a)) or length(bs) != length(pos).

You can change the last line to unname(unlist(... if you don't want a's items named.
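The interleaving step may be easier to follow in isolation. In this sketch the chunks of a are written out by hand, and the names a_parts, b_parts and ranks are only for illustration:

```r
a_parts <- list(c(2, 3, 4), c(9, 10, 2, 4), 19)   # chunks of a
b_parts <- list(c(2, 1), c(0, 1))                 # vectors to insert

# Ranks 1 2 3 for the chunks, followed by 1 2 for the insertions
ranks <- c(seq_along(a_parts), seq_along(b_parts))

# order() keeps ties in their original order, so each chunk
# comes before the insertion that has the same rank
idx <- order(ranks)
idx
# [1] 1 4 2 5 3

unlist(c(a_parts, b_parts)[idx])
# [1]  2  3  4  2  1  9 10  2  4  0  1 19
```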

Upvotes: 6

Ricardo Saporta

Reputation: 55340

You can use the following function,

ins(a, list(b, d), pos=c(3, 7))
# [1]  2  3  4  2  1  9 10  2  4  0  1 19

where:

ins <- function(a, to.insert=list(), pos=c()) {

  c(a[seq(pos[1])], 
    to.insert[[1]], 
    a[seq(pos[1]+1, pos[2])], 
    to.insert[[2]], 
    a[seq(pos[2]+1, length(a))]  # start after pos[2] so a[pos[2]] is not repeated
    )
}

Upvotes: 12
