Reputation: 323
I have a foreach loop that produces a list in each iteration, and a .combine function to merge them that looks like this:
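mergelists = function(x, xn) {
  # number of values already collected per name
  padlen = length(x[[1]])
  # names seen before but missing from xn: contribute a single 0
  for (n in names(x)[!names(x) %in% names(xn)]) xn[[n]] = 0
  # names new in xn: back-fill with zeros for the earlier iterations
  for (n in names(xn)[!names(xn) %in% names(x)]) xn[[n]] = c(rep(0, padlen), xn[[n]])
  # append xn to x, name by name
  for (idx in names(xn)) { x[[idx]] = c(x[[idx]], xn[[idx]]) }
  x
}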
The first two for-loops modify the new list (xn) to make it compatible with the one that gathers the results (x). The last one appends xn to x, name by name.
I believe my code is ridiculously inefficient, because it re-allocates a lot and uses for-loops, but I can't think of a better solution. Any ideas?
Some more explanation: I don't know the list names in advance (they are patterns from a bootstrap exercise which takes place in the foreach part).
Example:
> x
$foo
[1] 3 2
$bar
[1] 3 2
and
> xn
$foo
[1] 1
$baz
[1] 1
should join to
> x
$foo
[1] 3 2 1
$bar
[1] 3 2 0
$baz
[1] 0 0 1
That's it.
Upvotes: 4
Reputation: 72739
If foo and bar exist in every list and are in order, then mapply works. As @BenBarnes suggested, having a pre-processing step to create the 0's makes this a viable option even if they do not exist everywhere. Sorting is easy. I've changed the 0's to NAs since that seems more appropriate.
# Make data
x <- list(foo=c(3,2),bar=c(6,7))
xn <- list(foo=c(1),bar=c(1),aught=c(5,2))
lol <- list(x=x,xn=xn)
# Pre-process
allnames <- sort(unique(unlist(lapply(lol, names))))
cleanlist <- function(l, allnames) {
  ret <- l[allnames]
  names(ret) <- allnames
  ret[sapply(ret, is.null)] <- NA
  ret
}
lol <- lapply(lol,cleanlist,allnames=allnames)
# Combine
do.call("mapply", c(c,lol) )
Which produces a matrix with one column per name:
    aught bar foo
x      NA   6   3
xn1     5   7   2
xn2     2   1   1
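If you want the zero-padded list from the question rather than a matrix, the same idea can be adapted. The following is only a sketch: pad_list is a hypothetical helper (it is not part of the code above), and it assumes that all vectors within one list have the same length, as in the question's example.

```r
# Hypothetical helper: give every list the same names, filling absent
# entries with `fill` repeated to the length of that list's vectors.
pad_list <- function(l, allnames, fill = 0) {
  len <- length(l[[1]])  # assumption: all vectors in `l` share this length
  absent <- setdiff(allnames, names(l))
  l[absent] <- rep(list(rep(fill, len)), length(absent))
  l[allnames]            # same order of names in every list
}

# Data from the question
x   <- list(foo = c(3, 2), bar = c(3, 2))
xn  <- list(foo = 1, baz = 1)
lol <- list(x, xn)

allnames <- unique(unlist(lapply(lol, names)))
padded   <- lapply(lol, pad_list, allnames = allnames)

# SIMPLIFY = FALSE keeps the result as a named list instead of a matrix
merged <- do.call(mapply, c(list(FUN = c, SIMPLIFY = FALSE), padded))
```

Because lol can hold any number of lists, this version also works for more than two at once.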
Benchmarking
That said, if you're hoping for speed gains, the original version is still the fastest, presumably because it does the least. But the loopless approach is pretty elegant and scales to an arbitrary number of x's.
library(microbenchmark)
microbenchmark( mergelists(lol$x,lol$xn), mergeList2(lol$x,lol$xn), do.call("mapply", c(c,lol) ) )
Unit: microseconds
                          expr       min         lq     median         uq       max
1 do.call("mapply", c(c, lol))   155.048   159.5175   192.0635   195.5555   245.841
2    mergeList2(lol$x, lol$xn) 19938.288 20095.9905 20225.4750 20719.6730 27143.674
3    mergelists(lol$x, lol$xn)    63.416    68.1650    78.0825    84.3680    95.265
Upvotes: 3
Reputation: 19454
In my benchmarking, this approach takes longer than yours, but since I had already worked it out, I thought I'd post it anyway. Here's to doubling effort. If the names are completely unknown and you are forced to pad with zeros in the .combine function, you could try the following (perhaps on a subset of your iterations first, to see if it works):
library(reshape2)
mergeList2 <- function(x, xn) {
  xDF <- data.frame(ID = seq_along(x[[1]]), x)
  xnDF <- data.frame(ID = seq_along(xn[[1]]) + nrow(xDF), xn)
  meltedX <- melt(xDF, id = "ID")
  meltedXN <- melt(xnDF, id = "ID")
  res <- as.list(dcast(rbind(meltedX, meltedXN), ID ~ variable,
                       fill = 0))[-1]
  return(res)
}
Your example:
mergeList2(list(foo = c(3, 2), bar = c(3, 2)),
list(foo = 1, baz= 1))
# $foo
# [1] 3 2 1
# $bar
# [1] 3 2 0
# $baz
# [1] 0 0 1
Test it out with a foreach example:
library(foreach)
set.seed(1)
foreach(dd = 1:10, .combine = mergeList2) %do% {
theNames <- sample(c("foo", "bar", "baz"), 2)
ans <- as.list(rpois(2, 4))
names(ans) <- theNames
ans
}
# $foo
# [1] 4 7 2 4 0 2 0 4 5 3
# $baz
# [1] 7 0 0 5 3 5 3 4 0 5
# $bar
# [1] 0 5 2 0 5 0 0 0 6 0
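Another option, if memory allows, is to skip .combine altogether: by default foreach returns a plain list with one element per iteration, which can then be merged in a single pass instead of growing x on every call. A minimal sketch, assuming every iteration returns a named list of single numbers (as in the test above). merge_once is a hypothetical helper name, and the results list below simply stands in for the foreach output:

```r
# Hypothetical helper: merge all per-iteration results in one pass.
# Assumption: each element of `results` is a named list of single numbers.
merge_once <- function(results, fill = 0) {
  all_names <- unique(unlist(lapply(results, names)))
  out <- lapply(all_names, function(nm) {
    # one value per iteration; `fill` where the name did not occur
    vapply(results,
           function(r) if (is.null(r[[nm]])) fill else r[[nm]],
           numeric(1))
  })
  names(out) <- all_names
  out
}

# Stand-in for the list a foreach loop returns without .combine
results <- list(list(foo = 3, bar = 3),
                list(foo = 2, bar = 2),
                list(foo = 1, baz = 1))

merged_once <- merge_once(results)
```

Each output vector is allocated once at its final length, so nothing is re-allocated per iteration. I have not benchmarked this against the versions above.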
Upvotes: 3