Reputation: 12178

R: predict.glm equivalent for MCMCpack::MCMClogit

I am running a Bayesian logit with MCMCpack::MCMClogit. The syntax is easy and follows lm() or glm(), but I can't find any equivalent of the predict.glm function. Is there any way of predicting the probabilities of the outcomes in MCMClogit for each unit of observation in the dataframe? predict() is especially useful for validating training data from new data, which is what I ultimately have to do.

df = read.csv("http://dl.dropbox.com/u/1791181/MCMC.csv")#Read in data
model.glm = glm(SECONDARY.LEVEL ~ AGE + SEX, data=df, family=binomial(link=logit))
glm.predict = predict(model.glm, type="response")

For MCMClogit():

model.mcmc = MCMClogit(SECONDARY.LEVEL ~ AGE + SEX, data=df, mcmc=1000)

Upvotes: 4

Answers (2)

Tim Goodman

Reputation: 23976

You could use the posterior distribution of model parameters produced by MCMC to get a distribution of predictions, using the logistic function.

For instance, if your model formula is y ~ x1 + x2 + x3, and your MCMC output is stored in the variable posterior.mcmc, then you could use

function(x1, x2, x3) 1 / (1 + exp(-posterior.mcmc %*% rbind(1, x1, x2, x3)))

to give the distribution analogous to predict.glm(., 'response')

More detailed example for the case of a single input variable:

library(extraDistr)
library(MCMCpack)

# Take x uniformly distributed between -100 and 100
x <- runif(2000, min=-100, max=100)

# Generate a response which is logistic with some noise
beta <- 1/8
eps <- rnorm(length(x), 0, 1)
p <- function(x, eps) 1 / (1 + exp(-beta*x + eps))
p.x <- p(x, eps)
y <- sapply(p.x, function(p) rbern(1, p))
df1 <- data.frame(x, y)

# Fit by logistic regression
glm.logistic <- glm(y ~ x, df1, family=binomial)

# MCMC gives a distribution of values for the model parameters
posterior.mcmc <- MCMClogit(y ~ x, df1, verbose=2000)
densplot(posterior.mcmc)

# Thus, we have a distribution of model predictions for each x
predict.p.mcmc <- function(x) 1 / (1 + exp(-posterior.mcmc %*% rbind(1,x)))
interval.p.mcmc <- function(x, low, high) apply(predict.p.mcmc(x), 2,
                                                function(x) quantile(x, c(low, high)))

predict.y.mcmc <- function(x) posterior.mcmc %*% rbind(1,x)
interval.y.mcmc <- function(x, low, high) apply(predict.y.mcmc(x), 2,
                                                function(x) quantile(x, c(low, high)))

## Plot the data and fits ##

plot(x, p.x, ylab = 'Pr(y=1)', pch = 20, cex = 0.5, main = 'Probability vs x')

# x-values for prediction
x_test <- seq(-100, 100, 0.01)

# Blue line is the logistic function we used to generate the data, with noise removed
p_of_x_test <- p(x_test, 0)
lines(x_test, p_of_x_test, col = 'blue')

# Green line is the prediction from logistic regression
lines(x_test, predict(glm.logistic, data.frame(x = x_test), 'response'), col = 'green')

# Red lines indicates the range of model predictions from MCMC
# (for each x, 95% of the distribution of model predictions lies between these bounds)  
interval.p.mcmc_95 <- interval.p.mcmc(x_test, 0.025, 0.975)
lines(x_test, interval.p.mcmc_95[1,], col = 'red')
lines(x_test, interval.p.mcmc_95[2,], col = 'red')

# Similarly for the log-odds
plot(x, log(p.x/(1 - p.x)), ylab = 'log[Pr(y=1) / (1 - Pr(y=1))]',
     pch = 20, cex = 0.5, main = 'Log-Odds vs x')

lines(x_test, log(p_of_x_test/(1 - p_of_x_test)), col = 'blue')

lines(x_test, predict(glm.logistic, data.frame(x = x_test)), col = 'green')

interval.y.mcmc_95 <- interval.y.mcmc(x_test, 0.025, 0.975)
lines(x_test, interval.y.mcmc_95[1,], col = 'red')
lines(x_test, interval.y.mcmc_95[2,], col = 'red')

Upvotes: 2

Pop

Reputation: 12411

The description of the function says :

This function generates a sample from the posterior distribution of a logistic regression model using a random walk Metropolis algorithm.

I think therefore that your model.mcmc already contains the points that MCMClogit() has simulated.

You can use str to see what it contains and summary and plot functions on it like in the example there : http://cran.r-project.org/web/packages/MCMCpack/MCMCpack.pdf

Upvotes: 0

R: predict.glm equivalent for MCMCpack::MCMClogit

Answers (2)

Related Questions