Reputation: 297
I have a vector in R,
a = c(2,3,4,9,10,2,4,19)
Let us say I want to efficiently insert the following vectors, b and d,
b = c(2,1)
d = c(0,1)
right after the 3rd and 7th positions (the "4" entries), resulting in,
e = c(2,3,4,2,1,9,10,2,4,0,1,19)
How would I do this efficiently in R, without recursively using cbind or the like?
I found a package, R.basic, but it's not on CRAN, so I would prefer a supported alternative.
Upvotes: 23
Views: 43067
Reputation: 2141
The straightforward approach:
b.pos <- 3
d.pos <- 7
c(a[1:b.pos],b,a[(b.pos+1):d.pos],d,a[(d.pos+1):length(a)])
[1] 2 3 4 2 1 9 10 2 4 0 1 19
Note the importance of the parentheses around the boundaries of the : operator.
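To see why the parentheses matter: in R, : binds more tightly than +, so b.pos+1:d.pos is parsed as b.pos + (1:d.pos). A quick check with the same values as above (the names wrong and right are only for illustration):

```r
b.pos <- 3
d.pos <- 7

# Without parentheses this is 3 + (1:7), i.e. the indices 4 through 10
wrong <- b.pos+1:d.pos
# With parentheses this is the intended range, 4 through 7
right <- (b.pos+1):d.pos

print(wrong)
print(right)
```

Indexing a with the first form would run past the end of the vector and return NA for the out-of-range positions.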
Upvotes: 5
Reputation: 51
After using Ferdinand's function, I tried to write my own and surprisingly it is far more efficient.
Here's mine:
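# Inserts elems[i] before position pos[i] of the original vector.
# pos must be sorted in increasing order.
insertElems = function(vect, pos, elems) {
  l = length(vect)
  j = 0  # number of elements inserted so far
  for (i in seq_along(pos)) {
    if (pos[i] == 1)
      vect = c(elems[j+1], vect)
    else if (pos[i] == l+1)  # compare with the original length, not the grown one
      vect = c(vect, elems[j+1])
    else
      vect = c(vect[1:(pos[i]-1+j)], elems[j+1], vect[(pos[i]+j):(l+j)])
    j = j+1
  }
  return(vect)
}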
tmp = 1:5
insertElems(tmp, c(2,4,5), c(NA,NA,NA))
# [1] 1 NA 2 3 NA 4 NA 5
insert.at(tmp, c(2,4,5), c(NA,NA,NA))
# [1] 1 NA 2 3 NA 4 NA 5
And here is the benchmark result:
> microbenchmark(insertElems(tmp, c(2,4,5), c(NA,NA,NA)), insert.at(tmp, c(2,4,5), c(NA,NA,NA)), times = 10000)
Unit: microseconds
                                        expr    min     lq     mean median     uq      max neval
 insertElems(tmp, c(2, 4, 5), c(NA, NA, NA))  9.660 11.472 13.44247  12.68 13.585 1630.421 10000
   insert.at(tmp, c(2, 4, 5), c(NA, NA, NA)) 58.866 62.791 70.36281  64.30 67.923 2475.366 10000
My code also handles some cases that insert.at does not, such as inserting at position 1:
> insert.at(tmp, c(1,4,5), c(NA,NA,NA))
# [1] 1 2 3 NA 4 NA 5 NA 1 2 3
# Warning message:
# In result[c(TRUE, FALSE)] <- split(a, cumsum(seq_along(a) %in% (pos))) :
# number of items to replace is not a multiple of replacement length
> insertElems(tmp, c(1,4,5), c(NA,NA,NA))
# [1] NA 1 2 3 NA 4 NA 5
Upvotes: 3
Reputation: 193497
Here's an alternative that uses append. It's fine for small vectors, but I can't imagine it being efficient for large vectors, since a new vector is created on each iteration of the loop (which is, obviously, bad). The trick is to process the insertions in decreasing order of position, so that append inserts each one in the correct place relative to the original vector.
a = c(2,3,4,9,10,2,4,19)
b = c(2,1)
d = c(0,1)
pos <- c(3, 7)
z <- setNames(list(b, d), pos)
# order numerically: compared as strings, "10" would sort below "3"
z <- z[order(as.numeric(names(z)), decreasing=TRUE)]
for (i in seq_along(z)) {
  a <- append(a, z[[i]], after = as.numeric(names(z)[[i]]))
}
a
# [1] 2 3 4 2 1 9 10 2 4 0 1 19
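The same idea can be wrapped in a small function. This is only a sketch: the name insert_after and its argument layout are made up here, and it keeps the positions as a numeric vector instead of storing them as list names. Working from the largest position down means each insertion leaves the positions still to be processed unchanged.

```r
# Hypothetical helper (not from any package): insert items[[i]] after
# position pos[i] of x, where positions refer to the original vector.
insert_after <- function(x, items, pos) {
  stopifnot(length(items) == length(pos))
  # Go from the largest position to the smallest, so earlier positions
  # are not shifted by insertions made further along the vector.
  for (i in order(pos, decreasing = TRUE)) {
    x <- append(x, items[[i]], after = pos[i])
  }
  x
}

a <- c(2,3,4,9,10,2,4,19)
b <- c(2,1)
d <- c(0,1)
res <- insert_after(a, list(b, d), c(3, 7))
print(res)
```

Like the loop above, this still copies the vector once per insertion, so it is best suited to a small number of insertions.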
Upvotes: 2
Reputation: 12819
Try this:
result <- vector("list",5)
result[c(TRUE,FALSE)] <- split(a, cumsum(seq_along(a) %in% (c(3,7)+1)))
result[c(FALSE,TRUE)] <- list(b,d)
f <- unlist(result)
identical(f, e)
#[1] TRUE
EDIT: generalization to arbitrary number of insertions is straightforward:
insert.at <- function(a, pos, ...){
  dots <- list(...)
  stopifnot(length(dots)==length(pos))
  result <- vector("list",2*length(pos)+1)
  result[c(TRUE,FALSE)] <- split(a, cumsum(seq_along(a) %in% (pos+1)))
  result[c(FALSE,TRUE)] <- dots
  unlist(result)
}
> insert.at(a, c(3,7), b, d)
[1] 2 3 4 2 1 9 10 2 4 0 1 19
> insert.at(1:10, c(4,7,9), 11, 12, 13)
[1] 1 2 3 4 11 5 6 7 12 8 9 13 10
> insert.at(1:10, c(4,7,9), 11, 12)
Error: length(dots) == length(pos) is not TRUE
Note the bonus error checking if the number of positions and insertions do not match.
Upvotes: 17
Reputation: 66819
Here's another function, using Ricardo's syntax, Ferdinand's split and @Arun's interleaving trick from another question:
ins2 <- function(a,bs,pos){
  # seq_along() rather than seq(), which misbehaves on a length-one numeric a
  as <- split(a,cumsum(seq_along(a)%in%(pos+1)))
  idx <- order(c(seq_along(as),seq_along(bs)))
  unlist(c(as,bs)[idx])
}
The advantage is that this should extend to more insertions. However, it may produce weird output when passed invalid arguments, e.g., with any(pos > length(a)) or length(bs)!=length(pos).
You can change the last line to unname(unlist(... if you don't want a's items named.
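If the weird output on invalid arguments is a concern, one option is to check the arguments up front. The following is a sketch of ins2 with such checks; the name ins2_checked and the particular set of checks are my additions, not part of the function above.

```r
# Sketch: ins2 with argument checks added. bs is a list of vectors to
# insert, pos the positions of a after which each one goes.
ins2_checked <- function(a, bs, pos) {
  stopifnot(
    is.list(bs),
    length(bs) == length(pos),
    !is.unsorted(pos, strictly = TRUE),  # increasing, no duplicates
    all(pos >= 1),
    all(pos <= length(a))
  )
  # Split a into pieces that start right after each insertion position
  as <- split(a, cumsum(seq_along(a) %in% (pos + 1)))
  # Interleave pieces and insertions: piece 1, insert 1, piece 2, ...
  idx <- order(c(seq_along(as), seq_along(bs)))
  unname(unlist(c(as, bs)[idx]))
}

a <- c(2,3,4,9,10,2,4,19)
res <- ins2_checked(a, list(c(2,1), c(0,1)), c(3, 7))
print(res)
```

With these checks, a mismatched call such as ins2_checked(a, list(c(2,1)), c(3, 7)) stops with an error instead of returning a misleading vector.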
Upvotes: 6
Reputation: 55340
You can use the following function,
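ins(a, list(b, d), pos=c(3, 7))
# [1] 2 3 4 2 1 9 10 2 4 0 1 19
where:
# Handles exactly two insertions, after positions pos[1] and pos[2]
ins <- function(a, to.insert=list(), pos=c()) {
  c(a[seq(pos[1])],
    to.insert[[1]],
    a[seq(pos[1]+1, pos[2])],
    to.insert[[2]],
    a[seq(pos[2]+1, length(a))]  # start at pos[2]+1 so a[pos[2]] is not repeated
  )
}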
Upvotes: 12