Reputation: 6277
I'm using the svyolr
function from the survey
package to run an ordinal regression model. I'd like to analyze the model by inspecting the predictions it would make for my survey data and for hypothetical sets of data.
Since there doesn't seem to be a predict
method for the svyolr
function, how can I get model predictions from a fitted svyolr
model?
Here's some example data and a simple example model to work with:
library(survey)
# Load example data
data(api)
# Create a survey design object
dclus1 <- svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc)
# Add an ordinal variable to the data
dclus1 <- update(dclus1, mealcat=cut(meals,c(0,25,50,75,100)))
# Fit an ordinal regression model
m <- svyolr(formula = mealcat ~ avg.ed + stype,
design = dclus1)
# Generate fake data to use in evaluating model predictions
fake_data <- dclus1$variables[,c("avg.ed", "stype")]
fake_data <- subset(fake_data, !is.na(avg.ed))
Upvotes: 0
Views: 928
Reputation: 6104
svyolr is based on MASS::polr so i guess you could hack a prediction algorithm for that..
i doubt the confidence intervals that glm.predict::polr.predict() generates are legitimate
library(glm.predict)
library(survey)
data = MASS::survey
data$Smoke = ordered(MASS::survey$Smoke,levels=c("Never","Occas","Regul","Heavy"))
model1 = MASS::polr(Smoke ~ Sex + Height, data=data)
summary(model1)
# comparing a man to a woman
polr.predict(model1, c(1,170),sim.count=10000)
this_design <- svydesign( ~ 1 , data = data , weights = ~ Age )
model2 = svyolr(Smoke ~ Sex + Height, this_design)
class(model2) <- 'polr'
# prediction for male with height 170
( this_prediction <- polr.predict(model2, c(1,170),sim.count=10000) )
fake_data <- data[ c( 'Sex' , 'Height' ) ]
fake_data <- fake_data[ complete.cases( fake_data ) , ]
fake_data[ , ] <- sapply( fake_data[ , ] , as.numeric )
# lowering sim.count just so it runs faster
res <- apply( fake_data , 1 , function( w ) polr.predict(model2, w ,sim.count=100) )
# prediction for the first record
res[1:4,1]
# lower CI for the first record
res[5:8,1]
# upper CI for the first record
res[9:12,1]
# prediction of heavy smoking for every record
fake_data[ , 'predicted_heavy_smoking' ] <- res[ 4 , ]
Upvotes: 1