Reputation: 1584
Recently, I had to work with R and SPSS to analyze a dataset with Multinomial Regression framework. We surveyed some participants (10-12 years old) and we asked which "professional field" they like the most, then we asked how often they accessed the internet. So, the outcome is a Categorical variable": professional field -- "military", "I don't know", and "Other profession"; and the independent variable is also a categorical variable (how often do you access the internet ("I don't have access", "1-3 hours/day", "3-5 hours/day").
I ran a model using R (with nnet package, via multinom function), and other statistician ran using SPSS. All reference category was defined correctly.
Now, when we compare the results, they don't agree for the second category of my independent variable. The first one is ok.
Please have a look at the entire code:
library(tidyverse)
library(stargazer)
library(nnet)
ds <- ds %>% mutate(internet = factor(internet))
ds <- ds %>% mutate(internet = relevel(internet, ref = "I dont have internet access"))
ds <- ds %>% mutate(field = factor(field))
ds <- ds %>% mutate(fielf = relevel(field, ref = "I dont know"))
mod <- multinom(field ~ internet, data = ds, maxit=1000, reltol = 1.0e-9)
stargazer(mod, type = 'text')
For the sake of clarity, when the independent variable has only two categories (such as sex, male and female), both R and SPSS agree with its results
After a huge effort trying to understand the discrepancy between both results, I read that nnet estimation could have some problems (optimization problem ?), and that the discrepancy of results is not so strange as I was thinking at the beginning..
Can someone explain to me what's going on here? What am I missing?! I assume SPSS and R must have the same results if we are running the same model.
Thank you
That's the ds I'm using in this example:
ds <- structure(list(sex = structure(c(2L, 1L, 2L, 1L, 2L, 1L, 2L,
2L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 1L,
2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 2L,
2L, 1L), .Label = c("male", "female"), class = "factor"), internet = structure(c(3L,
3L, 2L, 3L, 2L, 3L, 3L, 3L, 2L, 2L, 3L, 3L, 3L, 2L, 2L, 3L, 2L,
2L, 2L, 2L, 3L, 3L, 2L, 2L, 3L, 1L, 3L, 2L, 2L, 2L, 3L, 3L, 3L,
2L, 3L, 3L, 2L, 3L, 3L, 3L, 3L, 3L, 1L, 2L, 2L, 3L, 3L, 2L, 2L,
2L, 3L, 3L, 3L, 3L, 3L, 1L, 2L, 3L, 1L, 2L, 2L, 2L, 3L, 3L, 2L,
2L, 1L, 3L, 2L, 2L, 3L, 2L, 2L), .Label = c("I dont have internet access",
"1-3 hours/day", "3-5 hours/day"), class = "factor"), field = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 1L, 1L, 1L, 3L, 1L, 1L, 1L, 3L, 1L, 1L, 1L, 1L,
1L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("I dont know", "Military",
"Other profession"), class = "factor")), class = "data.frame", row.names = c(NA,
-73L))
Upvotes: 3
Views: 548
Reputation: 73612
You could use mlogit
alternatively which resembles the SPSS results more closely. The SPSS values should be valid, since Stata yields similar results (-14.88 (982.95), 11.58 (982.95), 11.44 (982.95)
). The remaining deviations might stem from the ridiculous significance of "other profession".
library(mlogit)
ml.dat <- mlogit.data(ds, choice="field", shape="wide")
ml <- mlogit(field ~ 1 | internet, data=ml.dat)
Yielding
texreg::screenreg(ml)
=========================================================
Model 1
---------------------------------------------------------
Military:(intercept) -0.41
(0.91)
Other profession:(intercept) -16.89
(2690.89)
Military:factor(internet)1-3 hours/day -1.50
(1.06)
Other profession:factor(internet)1-3 hours/day 13.60
(2690.89)
Military:factor(internet)3-5 hours/day -1.64
(1.06)
Other profession:factor(internet)3-5 hours/day 13.46
(2690.89)
---------------------------------------------------------
AIC 85.49
Log Likelihood -36.74
Num. obs. 73
=========================================================
*** p < 0.001, ** p < 0.01, * p < 0.05
Upvotes: 0