Sean
Sean

Reputation: 631

How Can I manually obtain predict() values from coef/model.matrix returns on linear model

I am having a LOT of trouble figuring out how to find predicted values from a coefficients of models and the model matrix. I'm hoping someone can help.

I currently have a linear model with two independent variables that I am setting up. e.g.

data <- data.frame(d1,d2,d3)
lm.data <- lm(d1~d2*d3,data)

I can now get the coefficient vector

co.data <- coef(lm.data)

I can also now easily get the model matrix

mm.data <- model.matrix(lm.data)

This is where I can lost!! I am trying to teach myself how I can match the values I can when I use predict(lm.data) with the coefficients. In other words, I know the predicted values of the model from the design matrix and coefficients can be calculated, but after the past 48 hours of working on this, I truly have no idea.

Any help would be amazing.

Upvotes: 5

Views: 8109

Answers (1)

mathematical.coffee
mathematical.coffee

Reputation: 56935

You just need to know how a linear model works. If your formula is d1 ~ d2 * d3 and they're all numeric, then to predict you just do (intercept) + (d2 coefficient)*x_d2 + (d3 coefficient)*x_d3 + (d2:d3 coefficient)*x_d2*x_d3 and that will give you the predicted d1.

here's a reproducible example:

data(iris)
m <- lm(Sepal.Length ~ Petal.Length * Sepal.Width, iris)
co.data <- coef(m)

# we'll predict the sepal length for these petal lengths and sepal widths:
x.pl <- runif(5, min=1, max=2)
x.sw <- runif(5, min=2, max=5)
y.predicted <-  predict(m, data.frame(Petal.Length=x.pl, Sepal.Width=x.sw)) 
#        1        2        3        4        5 
# 5.379006 5.495907 5.296913 4.382487 5.131850 

Now to do it manually, let's look at the coefficients:

co.data
# Intercept)             Petal.Length              Sepal.Width Petal.Length:Sepal.Width 
# 1.40438275               0.71845958               0.84995691              -0.07701327 

According to the formula above:

y <- co.data[1] + co.data[2]*x.pl + co.data[3] * x.sw + co.data[4]*x.pl*x.sw
# [1] 5.379006 5.495907 5.296913 4.382487 5.131850

Rather than writing it out manually you can do something like:

# x is a matrix with columns 1, petal length, sepal width, pl*sw
# (matches order of co.data)
x <- cbind(1, matrix(c(x.pl, x.sw, x.pl*x.sw), ncol=3))
x %*% co.data
#          [,1]
# [1,] 5.379006
# [2,] 5.495907
# [3,] 5.296913
# [4,] 4.382487
# [5,] 5.131850

Upvotes: 18

Related Questions