Reputation: 21
Suppose I have my posterior draws of Beta using MCMC. For each row, I have a set of betas. Is it possible to turn this row of betas into a model object in R so i can use the predict() function? Specifically, some of the betas are for categorical random variables so If I wanted to apply the betas manually it'll be difficult.
I think to do it manually I will have to turn each categorical variable into multiple columns of indicator variables.
Upvotes: 2
Views: 388
Reputation: 94212
You can do this by constructing a model matrix and multiplying it by a coefficient vector.
Suppose this is our model formula. Species
is a categorical variable:
> m = Sepal.Length~Petal.Length+Petal.Width+Species
Now create a model matrix from that and the data:
> mm = model.matrix(m, data=iris)
> head(mm)
(Intercept) Petal.Length Petal.Width Speciesversicolor Speciesvirginica
1 1 1.4 0.2 0 0
2 1 1.4 0.2 0 0
3 1 1.3 0.2 0 0
4 1 1.5 0.2 0 0
5 1 1.4 0.2 0 0
6 1 1.7 0.4 0 0
The Species
has three levels, and you can see that all of the first few rows have zeroes for the two species columns, ergo they are of the other species.
To do predictions for some set of 5 coefficients, they need to be in the same order as the rows and with the same "constrast", ie the encoding of species is done the same way with two dummy variables. So your "beta" values have to look the same as the coefficients you'd get from lm
on the data:
> mfit = lm(m, data=iris)
> mfit$coefficients
(Intercept) Petal.Length Petal.Width Speciesversicolor
3.682982011 0.905945867 -0.005995401 -1.598361502
Speciesvirginica
-2.112646781
Now you can get predictions for any set of coefficients by matrix multiplication:
> head(mm %*% mfit$coefficients)
[,1]
1 4.950107
2 4.950107
3 4.859513
4 5.040702
5 4.950107
6 5.220692
If I've done that right, those values should equal the output from predict
:
> head(predict(mfit))
1 2 3 4 5 6
4.950107 4.950107 4.859513 5.040702 4.950107 5.220692
and if your coefficients are different you should get different predictions via matrix multiplying:
> head(mm %*% c(0,1,.2,.2,.4))
[,1]
1 1.44
2 1.44
3 1.34
4 1.54
5 1.44
6 1.78
Upvotes: 2