I have the following data.table and want to multiply column a by column b. Column a always holds a single number, while column b may sometimes hold a vector:
library(data.table)
tt <- c(33,44)
dt <- data.table(a = list(1, 2, 3), b = list(11, 22, tt))
dt[, t2 := sapply(b, function(x) x*a)]
I get an error: Error in x * a : non-numeric argument to binary operator
Because a is always a single number, I expected that row 3 would work, even though b is a vector.
The solution I found is to use mapply:
dt[, t2 := mapply(function(x,y) x*y, a, b)]
Why does it not work with sapply/lapply?
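Does dt$a[[1]] * dt$b work? (No.) While the first argument is a vector of length 1, the second is not an atomic vector; it is a list, and lists don't do arithmetic. The same thing happens in your call: inside sapply(b, function(x) x*a), x is a single element of b, but a is still the entire list column. sapply only iterates over one list/vector of values, so while sapply(a, function(AA) AA * b) might seem like a good start, b is still a list there, so the multiplication cannot be done.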
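A minimal sketch of that failure outside of data.table, assuming plain lists a and b that stand in for the two list columns from the question (msg and res are illustrative names):

```r
# Plain lists standing in for the list columns a and b
a <- list(1, 2, 3)
b <- list(11, 22, c(33, 44))

# a[[1]] is a number, but b is a whole list, so `*` cannot be applied;
# tryCatch captures the error message instead of stopping the script
msg <- tryCatch(a[[1]] * b, error = function(e) conditionMessage(e))
print(msg)

# Taken element by element, both operands are numeric vectors, so this works
res <- a[[3]] * b[[3]]
print(res)
```

The second multiplication succeeds because a[[3]] and b[[3]] are both numeric, which is the pairing the question was after.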
What you are trying to do is multiply a[[1]] with b[[1]], then a[[2]] with b[[2]], etc. That is what Map and mapply do well.
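As a rough illustration of that pairing, here is a sketch in base R, again assuming plain lists in place of the data.table columns (t2_map and t2_mapply are illustrative names):

```r
# Plain lists standing in for the list columns a and b from the question
a <- list(1, 2, 3)
b <- list(11, 22, c(33, 44))

# Map pairs a[[i]] with b[[i]] and always returns a list
t2_map <- Map(function(x, y) x * y, a, b)

# mapply does the same pairing; SIMPLIFY = FALSE keeps the result a list
t2_mapply <- mapply(function(x, y) x * y, a, b, SIMPLIFY = FALSE)

str(t2_map)
```

Keeping the result a list matters here because the third element has length 2 while the first two have length 1.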
Here is how these functions relate.
## equivalent
lapply(lst1, function(z) z + 1)
Map(function(z) z + 1, lst1)
## equivalent
sapply(lst1, function(z) z + 1)
mapply(function(z) z + 1, lst1)
That's it for single-vector processing. But when you want to iterate over multiple (two or more) vectors/lists at the same time, "zipping" them together, there are two options:
stopifnot(length(lst1) == length(lst2))
## equivalent
sapply(seq_along(lst1), function(ind) {
  lst1[[ind]] * lst2[[ind]]
})
mapply(function(o1, o2) o1 * o2, lst1, lst2)
mapply(`*`, lst1, lst2)
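To make the equivalence concrete, here is a small sketch with made-up values for lst1 and lst2 (the values and the names by_index, by_lambda and by_op are assumptions for illustration only):

```r
# Two hypothetical lists of equal length
lst1 <- list(1, 2, 3)
lst2 <- list(10, 20, 30)
stopifnot(length(lst1) == length(lst2))

# Index-based zipping with sapply
by_index <- sapply(seq_along(lst1), function(ind) lst1[[ind]] * lst2[[ind]])

# The same zipping with mapply, first with an anonymous function,
# then by passing the multiplication operator directly
by_lambda <- mapply(function(o1, o2) o1 * o2, lst1, lst2)
by_op <- mapply(`*`, lst1, lst2)

print(by_index)
```

All three calls produce the same numeric vector here, since every product has length 1.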
Commonalities and differences to know about them:
- sapply and mapply will try to simplify the return value if possible, so they might return a vector (if every return value has length 1), a matrix (if every return value is a vector of the same length greater than 1), or a list (if any length is different from the others). You can force a list with sapply(..., simplify=FALSE) and mapply(..., SIMPLIFY=FALSE) (the case difference is important).
- lapply and Map always return lists, regardless of the above conditions; many find this output consistency more reliable/desirable in a programmatic sense (i.e., in functions/packages).
- lapply will only return a named list if the vector/list is named, otherwise it is only indexable positionally; all of the others will auto-name the returned list if the input is named or if the input is character. (There might be more rules/exceptions to this, but it's a start.)
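The simplification and naming rules can be checked with a short sketch; the inputs and variable names below are made up for illustration:

```r
# Every result has length 1: simplified to a numeric vector
v <- sapply(list(1, 2, 3), function(z) z * 2)

# Every result has the same length 2: simplified to a matrix (2 rows, 3 columns)
m <- sapply(list(1, 2, 3), function(z) c(z, z * 2))

# Result lengths differ: the result stays a list
l <- sapply(list(1, 2:3), function(z) z * 2)

# simplify = FALSE forces a list even when simplification is possible
l2 <- sapply(list(1, 2, 3), function(z) z * 2, simplify = FALSE)

# Character input: sapply uses the values as names, lapply does not
n_sapply <- names(sapply(c("x", "yy"), nchar))
n_lapply <- names(lapply(c("x", "yy"), nchar))

str(list(v = v, m = m, l = l, l2 = l2))
```

Because the shape of the result depends on the data, lapply or Map may be the safer choice when the output feeds further code.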