asia jacobs
asia jacobs

Reputation: 21

Making predictions with a set of betas in R

Suppose I have my posterior draws of Beta using MCMC. For each row, I have a set of betas. Is it possible to turn this row of betas into a model object in R so i can use the predict() function? Specifically, some of the betas are for categorical random variables so If I wanted to apply the betas manually it'll be difficult.

I think to do it manually I will have to turn each categorical variable into multiple columns of indicator variables.

Upvotes: 2

Views: 388

Answers (1)

Spacedman
Spacedman

Reputation: 94212

You can do this by constructing a model matrix and multiplying it by a coefficient vector.

Suppose this is our model formula. Species is a categorical variable:

> m = Sepal.Length~Petal.Length+Petal.Width+Species

Now create a model matrix from that and the data:

> mm = model.matrix(m, data=iris)
> head(mm)
  (Intercept) Petal.Length Petal.Width Speciesversicolor Speciesvirginica
1           1          1.4         0.2                 0                0
2           1          1.4         0.2                 0                0
3           1          1.3         0.2                 0                0
4           1          1.5         0.2                 0                0
5           1          1.4         0.2                 0                0
6           1          1.7         0.4                 0                0

The Species has three levels, and you can see that all of the first few rows have zeroes for the two species columns, ergo they are of the other species.

To do predictions for some set of 5 coefficients, they need to be in the same order as the rows and with the same "constrast", ie the encoding of species is done the same way with two dummy variables. So your "beta" values have to look the same as the coefficients you'd get from lm on the data:

> mfit = lm(m, data=iris)
> mfit$coefficients
      (Intercept)      Petal.Length       Petal.Width Speciesversicolor 
      3.682982011       0.905945867      -0.005995401      -1.598361502 
 Speciesvirginica 
     -2.112646781 

Now you can get predictions for any set of coefficients by matrix multiplication:

> head(mm %*% mfit$coefficients)
      [,1]
1 4.950107
2 4.950107
3 4.859513
4 5.040702
5 4.950107
6 5.220692

If I've done that right, those values should equal the output from predict:

> head(predict(mfit))
       1        2        3        4        5        6 
4.950107 4.950107 4.859513 5.040702 4.950107 5.220692 

and if your coefficients are different you should get different predictions via matrix multiplying:

> head(mm %*% c(0,1,.2,.2,.4))
  [,1]
1 1.44
2 1.44
3 1.34
4 1.54
5 1.44
6 1.78

Upvotes: 2

Related Questions