Reputation: 591
I'm trying to generate predicted values from a large number of model simulations and I'm having a hard time doing it simply. I suspect I need something from the apply() family, but I can't figure out the syntax. Maybe my knowledge of apply() is weak. Or maybe my function is wrong. Any suggestions?
Suppose I've got the following coefficients resulting from six model simulations...
coef <- data.frame(intercept=c(2,3,5,7,2,1),
b1 = c(.2,.5,.6,.7,.9,.4),
b2 = c(10,11,12,11,9,10))
I want to compute the predicted values, i.e. the linear combination of each row above with each row of the following data frame...
df <- data.frame(age = c(50,20,19, 42),
height = c(60,72,79, 66))
...Using the following model equation:
coef$intercept + coef$b1*df$age + coef$b2*df$height
Done right, I should get the following 24 data values:
612.0 726.0 795.8 670.4
688.0 805.0 881.5 750.0
755.0 881.0 964.4 822.2
702.0 813.0 889.3 762.4
587.0 668.0 730.1 633.8
621.0 729.0 798.6 677.8
To get the above, I've tried the following function and use of apply()...
equation <- function(...) coef$intercept + coef$b1*df$age + coef$b2*df$height
result <- apply(df, 1, equation)
...but I don't get the correct answer. The "result" data frame just repeats the correct diagonals. I also get the message:
> Warning messages: 1: In coef$b1 * df$age : longer object length is not a multiple of shorter object length
Yes, I can get the correct answer through simple matrix multiplication:
df$ones <- 1
df <- df[,c(3, 1, 2)]
result <- as.matrix(coef) %*% t(as.matrix(df))
But it seems to me one ought to be able to do this more generally using apply() and a custom function. Use of apply() is more compact and puts me less at risk of having my matrix columns in the wrong order. Any suggestions?
Upvotes: 0
Views: 1434
Reputation: 1718
Here is what I'd do:
sapply(seq_len(nrow(coef)), function(x) {
  sapply(seq_len(nrow(df)), function(y) {
    coef$intercept[[x]] + coef$b1[[x]]*df$age[[y]] + coef$b2[[x]]*df$height[[y]]
  })
})
Result:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 612.0 688.0 755.0 702.0 587.0 621.0
[2,] 726.0 805.0 881.0 813.0 668.0 729.0
[3,] 795.8 881.5 964.4 889.3 730.1 798.6
[4,] 670.4 750.0 822.2 762.4 633.8 677.8
Use two sapply() calls, one for each object (df and coef).
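Note that the nested version returns one column per simulation, which is the transpose of the layout in the question. As a rough sketch (not the only way to do it), the inner loop can be dropped because R arithmetic is already vectorised over df$age and df$height, and vapply() can be used so the expected length of each result is checked. The names pred and i are just illustrative choices:

```r
coef <- data.frame(intercept = c(2, 3, 5, 7, 2, 1),
                   b1 = c(.2, .5, .6, .7, .9, .4),
                   b2 = c(10, 11, 12, 11, 9, 10))
df <- data.frame(age = c(50, 20, 19, 42),
                 height = c(60, 72, 79, 66))

# One call per simulation (row of coef); each call returns one value per row of df
pred <- vapply(seq_len(nrow(coef)), function(i) {
  coef$intercept[i] + coef$b1[i] * df$age + coef$b2[i] * df$height
}, numeric(nrow(df)))

# pred has observations in rows; transpose to put simulations in rows,
# as in the layout shown in the question
result <- t(pred)
```

With the transpose, result has six rows (simulations) and four columns (observations).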
Upvotes: 1
Reputation: 887223
We can do this with %*% on the original two-column df, adding the intercept column separately:
coef[,1] + as.matrix(coef[-1]) %*% t(df)
# [,1] [,2] [,3] [,4]
#[1,] 612 726 795.8 670.4
#[2,] 688 805 881.5 750.0
#[3,] 755 881 964.4 822.2
#[4,] 702 813 889.3 762.4
#[5,] 587 668 730.1 633.8
#[6,] 621 729 798.6 677.8
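If the worry is getting the matrix columns in the wrong order, one option is to pair coefficients and predictors by name rather than by position. This is only a sketch: the lookup vector vars is a hypothetical helper that is not part of the question, and it assumes b1 belongs to age and b2 to height, as in the model equation.

```r
coef <- data.frame(intercept = c(2, 3, 5, 7, 2, 1),
                   b1 = c(.2, .5, .6, .7, .9, .4),
                   b2 = c(10, 11, 12, 11, 9, 10))
df <- data.frame(age = c(50, 20, 19, 42),
                 height = c(60, 72, 79, 66))

# Hypothetical mapping: coefficient name -> predictor name
vars <- c(b1 = "age", b2 = "height")

# Select columns by name so the order in coef and df does not matter
B <- as.matrix(coef[, names(vars)])  # 6 x 2 slope coefficients
X <- as.matrix(df[, vars])           # 4 x 2 predictors, in matching order

# The intercept vector (length 6) is recycled down each column of the 6 x 4 product
result <- coef$intercept + B %*% t(X)
```

Because both matrices are built from the same lookup, reordering the columns of df or coef beforehand should not change the result.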
Upvotes: 3
Reputation: 1709
If you really want to use apply, you can do this:
result <- t(apply(coef, 1, function(x) x[1] + x[2]*df$age + x[3]*df$height))
> result
[,1] [,2] [,3] [,4]
[1,] 612 726 795.8 670.4
[2,] 688 805 881.5 750.0
[3,] 755 881 964.4 822.2
[4,] 702 813 889.3 762.4
[5,] 587 668 730.1 633.8
[6,] 621 729 798.6 677.8
But it's really preferable (and faster) to do the matrix multiplication.
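As for why the attempt in the question only produces the diagonals: equation() ignores the row that apply() passes in and always multiplies the full columns coef$b1 (length 6) and df$age (length 4), so R recycles the shorter vector and issues the warning. A minimal sketch of a fix, assuming the function is rewritten to take the row as an argument (the parameter name row is my own choice):

```r
coef <- data.frame(intercept = c(2, 3, 5, 7, 2, 1),
                   b1 = c(.2, .5, .6, .7, .9, .4),
                   b2 = c(10, 11, 12, 11, 9, 10))
df <- data.frame(age = c(50, 20, 19, 42),
                 height = c(60, 72, 79, 66))

# apply() converts df to a matrix and passes each row as a named numeric vector
equation <- function(row) {
  coef$intercept + coef$b1 * row[["age"]] + coef$b2 * row[["height"]]
}

# Each call returns 6 values (one per simulation), so the result is 6 x 4
result <- apply(df, 1, equation)
```

Here the columns are looked up by name inside the function, so the column order of df does not matter, and no transpose is needed because apply() binds the four length-6 results as columns.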
Upvotes: 3