WCMC

Reputation: 1791

Multinomial logistic regression in R: why does multinom (nnet package) give different results from mlogit (mlogit package)?

Both R functions, multinom (package nnet) and mlogit (package mlogit), can be used for multinomial logistic regression. So why does this example return different p-values for the coefficients?

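#load packages and prepare data

library(nnet)    # provides multinom
library(mlogit)  # provides mlogit and mlogit.data
mydata <- read.csv("http://www.ats.ucla.edu/stat/data/binary.csv")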
mydata$rank <- factor(mydata$rank)
mydata$gre[1:10] = rnorm(10,mean=80000)

#multinom:

test <- multinom(admit ~ gre + gpa + rank, data = mydata)
# Compute the summary once and reuse it for the z statistics.
test_summary <- summary(test)
z <- test_summary$coefficients/test_summary$standard.errors
# For simplicity, use a z-test to approximate the t-test.
pv <- (1 - pnorm(abs(z)))*2
pv
# (Intercept)         gre         gpa       rank2       rank3       rank4 
# 0.00000000  0.04640089  0.00000000  0.00000000  0.00000000  0.00000000 

#mlogit:

mldata = mlogit.data(mydata,choice = 'admit', shape = "wide")

mlogit.model1 <- mlogit(admit ~ 1 | gre + gpa + rank, data = mldata)
summary(mlogit.model1)
# Coefficients :
#                  Estimate  Std. Error t-value  Pr(>|t|)    
# 1:(intercept) -3.5826e+00  1.1135e+00 -3.2175 0.0012930 ** 
# 1:gre          1.7353e-05  8.7528e-06  1.9825 0.0474225 *  
# 1:gpa          1.0727e+00  3.1371e-01  3.4195 0.0006274 ***
# 1:rank2       -6.7122e-01  3.1574e-01 -2.1258 0.0335180 *  
# 1:rank3       -1.4014e+00  3.4435e-01 -4.0697 4.707e-05 ***
# 1:rank4       -1.6066e+00  4.1749e-01 -3.8482 0.0001190 ***

Why are the p-values from multinom and mlogit so different? I suspect the cause is the outliers I added with mydata$gre[1:10] = rnorm(10,mean=80000). If outliers are unavoidable (for example in genomics or metabolomics), which R function should I use?

Upvotes: 7

Views: 7298

Answers (2)

patL

Reputation: 2299

As an alternative, you can use broom, which returns a tidy summary for multinom class models.

library(broom)

tidy(test)

It returns a data.frame with z-statistics and p-values. See the tidy documentation for further information.


P.S.: I can't download the data from the link you posted, so I can't replicate the results.

Upvotes: 4

nothing

Reputation: 3290

The difference here is the difference between the Wald $z$ test (what you calculated in pv) and the likelihood ratio test (what is returned by summary(mlogit.model1)). The Wald test is computationally simpler, but in general has less desirable properties (e.g., its CIs are not scale-invariant). You can read more about the two procedures here.
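
For reference, the two statistics can be written out as follows. This is the standard textbook form, not taken from either package's documentation, and the symbols are introduced here for illustration: $\hat\beta_j$ is the estimate of coefficient $j$, $\widehat{\mathrm{SE}}$ its estimated standard error, $\Phi$ the standard normal CDF, $\ell$ the maximized log-likelihood, and $k$ the number of parameters dropped from the full model.

$$
z_j = \frac{\hat\beta_j}{\widehat{\mathrm{SE}}(\hat\beta_j)}, \qquad
p_{\text{Wald}} = 2\,\bigl(1 - \Phi(|z_j|)\bigr)
$$

$$
\Lambda = 2\,\bigl[\ell_{\text{full}} - \ell_{\text{reduced}}\bigr], \qquad
p_{\text{LR}} = \Pr\bigl(\chi^2_k \ge \Lambda\bigr)
$$

The first line is what (1 - pnorm(abs(z)))*2 computes in the question. The second compares two nested fits, and its $\chi^2_k$ reference distribution holds only asymptotically under the null hypothesis.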

To perform LR tests on your nnet model coefficients, you can load the car and lmtest packages and call Anova(test) (though you'll have to do a little more work for the single-df tests).
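
A single-df LR test can also be done by hand with nnet alone, by refitting the model without the term of interest and comparing deviances. The block below is a minimal sketch of that idea, not the Anova route itself. It assumes simulated stand-in data: the data frame sim, its column scales, and the coefficients used to generate it are all hypothetical, since the original file could not be downloaded.

```r
library(nnet)  # multinom; ships with R as a recommended package

# Hypothetical stand-in data, NOT the UCLA file from the question.
# gre is expressed in hundreds so the predictors are on similar scales.
set.seed(1)
n <- 400
sim <- data.frame(gre = rnorm(n, mean = 5.8, sd = 1.15),
                  gpa = rnorm(n, mean = 3.4, sd = 0.4))
sim$admit <- rbinom(n, size = 1,
                    prob = plogis(-5 + 0.3 * sim$gre + 0.8 * sim$gpa))

# Full model and the nested model with gre removed.
full    <- multinom(admit ~ gre + gpa, data = sim, trace = FALSE)
reduced <- multinom(admit ~ gpa,       data = sim, trace = FALSE)

# LR statistic: drop in residual deviance, 1 df for the single dropped term.
lr_stat <- deviance(reduced) - deviance(full)
lr_p    <- pchisq(lr_stat, df = 1, lower.tail = FALSE)

# Wald z-test for the same coefficient, as computed in the question.
s      <- summary(full)
wald_z <- unname(s$coefficients["gre"] / s$standard.errors["gre"])
wald_p <- 2 * pnorm(abs(wald_z), lower.tail = FALSE)

# Side-by-side p-values for the gre term.
print(c(LR = lr_p, Wald = wald_p))
```

On well-behaved data the two p-values are usually close. A large gap between them is a hint that the Wald standard errors are unreliable for that fit.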

Upvotes: 3
