PerkinsN

Reputation: 49

Warning message from ordinal logistic regression in R

This is my data, head(both):

 season  gender age   prog     grade 
    fall    woman  old  FRIST       B
    fall    woman  old  FRIST       A 
    spring  woman  old  FRIST       E 
    spring    man  old  NMATK       C 
    spring  woman  old  NFYSK       A 
    fall    woman  old  FRIST       E 

I want to fit a logistic regression with grade as the response variable. Specifically, I want four models that are independent of each other.

Here:

E/(A+B+C+D) = alpha_1 + beta^x_1 + beta^y_1 + ...

(D+E)/(A+B+C) = alpha_2 + beta^x_2 + beta^y_2 + ...

(C+D+E)/(A+B) = alpha_3 + beta^x_3 + beta^y_3 + ...

(B+C+D+E)/A = alpha_4 + beta^x_4 + beta^y_4 + ...

What I have done:

    library(MASS)  # provides polr()
    # betyg is Swedish for "grade"; it is the response column
    y <- factor(both$betyg)
    mod.fit <- polr(y ~ prog + gender + age + season, data=both, Hess=TRUE)
    summary(mod.fit)

Then I get this message:

    Warning message:
    In polr(y ~ prog + gender + age + season, data = both, Hess = TRUE) :
      design appears to be rank-deficient, so dropping some coefs

I know this is not an error, just a warning, but I do not know how to interpret it or what to do differently to avoid it.

Upvotes: 1

Views: 4667

Answers (1)

Max Ghenis

Reputation: 15773

Since your outcome is ordered, you'll probably do better with an ordinal model, but you may want to check the proportional odds assumption. The model you're describing is pretty much what polr does, though the four equations are not independent as you say: polr estimates them jointly, with a separate intercept for each split and one shared set of slopes. UCLA has a good tutorial on this.
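To make the link between the four cumulative splits and polr concrete, here is a minimal sketch on simulated data. The data frame `d`, its effect sizes, and the cut points are all made up for illustration (they are not your `both` data), and comparing slopes by eye is only an informal look at the proportional odds assumption, not a formal test.

```r
library(MASS)  # provides polr()

set.seed(1)
n <- 500
# Simulated stand-in data; column names and levels are assumptions
d <- data.frame(
  gender = factor(sample(c("man", "woman"), n, replace = TRUE)),
  season = factor(sample(c("fall", "spring"), n, replace = TRUE))
)
# A latent score with arbitrary effects is cut into five ordered grades
latent <- 0.8 * (d$gender == "woman") - 0.5 * (d$season == "spring") + rlogis(n)
d$grade <- cut(latent, breaks = c(-Inf, -1, 0, 1, 2, Inf),
               labels = c("A", "B", "C", "D", "E"), ordered_result = TRUE)

# One proportional-odds model: four intercepts, one shared set of slopes
po_fit <- polr(grade ~ gender + season, data = d, Hess = TRUE)

# Four separate binary logits, one per cumulative split (grade above A, above B, ...)
splits <- levels(d$grade)[1:4]
sep_slopes <- sapply(splits, function(k) {
  above <- as.integer(as.integer(d$grade) > match(k, levels(d$grade)))
  fit <- glm(above ~ gender + season, data = d, family = binomial)
  coef(fit)[-1]  # drop the intercept, keep the slopes
})

print(coef(po_fit))  # shared slopes from polr
print(sep_slopes)    # one column of slopes per split
```

If the columns of `sep_slopes` are roughly similar to each other and to `coef(po_fit)`, the shared-slope model is at least plausible; large differences between columns suggest the proportional odds assumption may not hold.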

As for determining which model is best, when dealing with fundamentally different types of models like these, I'd recommend cross-validation. Prediction accuracy doesn't lie, and any pseudo-R^2 metrics will differ in interpretation across models.
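A rough sketch of what that cross-validation could look like, comparing polr with a multinomial model from nnet. The simulated data, the predictors `x1` and `x2`, and the choice of 5 folds are assumptions for illustration only; plain classification accuracy is just one possible metric.

```r
library(MASS)  # polr()
library(nnet)  # multinom()

set.seed(2)
n <- 600
# Simulated stand-in data; predictors and effect sizes are made up
d <- data.frame(x1 = rnorm(n),
                x2 = factor(sample(c("a", "b"), n, replace = TRUE)))
latent <- 1.2 * d$x1 + 0.7 * (d$x2 == "b") + rlogis(n)
d$grade <- cut(latent, breaks = c(-Inf, -1.5, -0.5, 0.5, 1.5, Inf),
               labels = c("A", "B", "C", "D", "E"), ordered_result = TRUE)
# Unordered copy of the response for the multinomial model
d$grade_nom <- factor(d$grade, ordered = FALSE)

k <- 5
fold <- sample(rep(seq_len(k), length.out = n))  # random fold assignment

# For each fold: fit on the other folds, score accuracy on the held-out fold
acc <- sapply(seq_len(k), function(i) {
  train <- d[fold != i, ]
  test  <- d[fold == i, ]
  po <- polr(grade ~ x1 + x2, data = train)
  mn <- multinom(grade_nom ~ x1 + x2, data = train, trace = FALSE)
  truth <- as.character(test$grade)
  c(polr     = mean(as.character(predict(po, test, type = "class")) == truth),
    multinom = mean(as.character(predict(mn, test, type = "class")) == truth))
})

print(rowMeans(acc))  # mean held-out accuracy per model
```

Both models are scored on the same held-out folds, so the two accuracies are directly comparable in a way that pseudo-R^2 values are not.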

Also, since this question concerns statistics more than R coding/implementation, I'd recommend CrossValidated (the stats StackExchange site).

Upvotes: 1
