Adela

Reputation: 1797

mirt - odd results for nominal model

Using the mirt package, I obtained (possibly) odd results for my nominal model.

 library(difNLR)
 library(mirt)
 data("GMATtest", "GMATkey")
 key <- as.numeric(as.factor(GMATkey))
 data <- sapply(1:20, function(i) as.numeric(GMATtest[, i]))
 colnames(data) <- paste("Item", 1:ncol(data))
 scoredGMAT <- key2binary(data, key)

 # 2PL IRT model for scored data
 mod0 <- mirt(scoredGMAT, 1)
 # nominal model for unscored data
 mod1 <- mirt(data, 1, 'nominal')

# plots of characteristic curves for item 1
itemplot(mod0, 1)
itemplot(mod1, 1)

[Two images: item 1 characteristic curves from itemplot(mod0, 1) and itemplot(mod1, 1)]

I expected that the nominal model mod1 would give one curve, for the correct answer, very similar to the curve plotted for mod0. However, the distractors seem to have increasing probability with increasing theta, which does not seem reasonable. Of course, there may be something wrong with the data or (more probably) I'm missing something.

I have already checked the examples in the mirt help, and those results are as I expected.

Any suggestions (what may be wrong) would be appreciated!

One last thing: I also tried to fit a 2PLNRM model, but my R session aborted. Has anybody noticed the same issue? My code:

# 2PLNRM model
mod2 <- mirt(data, 1, "2PLNRM", key = key)
coef(mod2)$`Item 1`
itemplot(mod2, 1)

EDIT: Here is an example from the mirt package:

library(mirt)
data(SAT12)
SAT12[SAT12 == 8] <- NA #set 8 as a missing value
head(SAT12)

# correct answer key
key <- c(1, 4, 5, 2, 3, 1, 2, 1, 3, 1, 2, 4, 2, 1, 5, 3, 4, 4, 1, 4, 3, 
         3, 4, 1, 3, 5, 1, 3, 1, 5, 4, 5)
scoredSAT12 <- key2binary(SAT12, key)
mod0 <- mirt(scoredSAT12, 1)

# use the nominal model for the first 5 items and 2PL for the remaining 27
scoredSAT12[, 1:5] <- as.matrix(SAT12[, 1:5])
mod1 <- mirt(scoredSAT12, 1, c(rep('nominal', 5), rep('2PL', 27)))

coef(mod0)$Item.1
coef(mod1)$Item.1

itemplot(mod0, 1)
itemplot(mod1, 1)

These results are what I expected. However, when I fit the nominal model to all items, the curves change:

# nominal for all items
mod1 <- mirt(SAT12, 1, 'nominal')
coef(mod1)$Item.1
itemplot(mod1, 1)

So, as you suggested, it seems that theta and its interpretation changed, but why and how?

Upvotes: 1

Views: 749

Answers (2)

philchalmers

Reputation: 747

@Juan Bosco is correct that this behaviour is consistent. The issue with using the nominal response model for all items is that the direction of an increasing $\theta$ value is not obvious in the model because its direction is arbitrary (the items are 'unordered' by default, after all).

Moreover, mirt's default parameterisation assumes that the lowest/highest numerical category should be associated with low/high $\theta$ values. As a result, this type of flipping is common in multiple-choice items (where, unlike ordered rating-scale data, there should be no direct relationship between the category number and $\theta$), because the model picks the orientation that best matches these identification constraints.

To fix this, simply redefine the scoring constraints used by mirt by moving the highest fixed scoring coefficient to the category given by the actual scoring key. Like so:
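As a sketch of why the orientation is arbitrary, assume the one-factor nominal model is written with an overall slope $a$, scoring coefficients $ak_k$ and intercepts $d_k$ (the symbols $a$ and $d_k$ are notation assumed here for illustration; only the `ak` names appear in the code in this thread):

$$P(x = k \mid \theta) = \frac{\exp(a \, ak_k \, \theta + d_k)}{\sum_{j=0}^{K-1} \exp(a \, ak_j \, \theta + d_j)}$$

Replacing $(a, \theta)$ with $(-a, -\theta)$ leaves every product $a \, ak_k \, \theta$ unchanged, so the category probabilities are identical. If the latent distribution is symmetric around zero, the two orientations fit the data equally well, and nothing in an all-nominal model forces the keyed option to rise with $\theta$.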

#starting values data.frame
sv <- mirt(data, 1, 'nominal', pars = 'values')
head(sv)

# set all 'ak' scoring coefficients to 0 and mark them as freely estimated
sv$value[grepl('ak', sv$name)] <- 0
sv$est[grepl('ak', sv$name)] <- TRUE

nms <- colnames(data)
for(i in seq_along(nms)){

    # fix the scoring coefficient of the keyed (correct) category at 3, the highest value
    pick <- paste0('ak', key[i]-1)
    index <- sv$item == nms[i] & pick == sv$name
    sv[index, 'value'] <- 3
    sv[index, 'est'] <- FALSE

    # fix an arbitrary other category at 0 to act as the lowest category
    if(pick == 'ak0') pick2 <- 'ak3'
    else pick2 <- paste0('ak', key[i]-2)
    index2 <- sv$item == nms[i] & pick2 == sv$name
    sv[index2, 'est'] <- FALSE
}

#estimate
mod2 <- mirt(data, 1, 'nominal', pars=sv)
plot(mod2, type = 'trace')
itemplot(mod2, 1)
coef(mod2, simplify=TRUE)

At the very least, this informs the model which category is the highest, and therefore provides enough information to finish with a more appropriate orientation. Note that it doesn't really affect the interpretation of the model per se, because all that happens is that the slopes are multiplied by -1 and the scoring coefficients are adjusted accordingly. HTH.
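The sign argument can be checked numerically with a small base-R sketch. It does not call mirt; `nominal_probs` is a hypothetical helper and the parameter values are made up for illustration, assuming the slope-times-scoring-coefficient form of the nominal model:

```r
# Category probabilities for one item under a one-factor nominal model.
# a: overall slope; ak: scoring coefficients; d: intercepts (one per category).
# nominal_probs is a hypothetical helper, not a mirt function.
nominal_probs <- function(theta, a, ak, d) {
  z <- a * ak * theta + d   # linear predictor for each category
  ez <- exp(z - max(z))     # subtract the maximum for numerical stability
  ez / sum(ez)
}

# Made-up parameters for a four-category item (not estimates from the GMAT data)
a     <- 1.2
ak    <- c(0, 1.5, 0.5, 3)
d     <- c(0, 0.4, -0.3, 0.8)
theta <- 0.7

p_original <- nominal_probs(theta, a, ak, d)
p_flipped  <- nominal_probs(-theta, -a, ak, d)   # slope and theta both negated

# Compare the two orientations
all.equal(p_original, p_flipped)
```

Negating the slope and mirroring theta gives the same probabilities, so a model in which the distractors appear to rise with theta can describe the same data as one in which the keyed option rises.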

Upvotes: 1

Adela

Reputation: 1797

Well, as suggested by Juan, the problem is that the estimate of theta changes when a different IRT model is used. Moreover, there is a connection between the estimates from the 2PL and nominal models.

library(difNLR)
library(mirt)
data("GMATtest", "GMATkey")
key <- as.numeric(as.factor(GMATkey))
data <- sapply(1:20, function(i) as.numeric(GMATtest[, i]))
colnames(data) <- paste("Item", 1:ncol(data))
scoredGMAT <- key2binary(data, key)

# 2PL IRT model for scored data
mod0 <- mirt(scoredGMAT, 1)
# nominal model for unscored data
mod1_all <- mirt(data, 1, 'nominal')
# nominal model for only first item
df <- data.frame(data[, 1], scoredGMAT[, 2:20])
mod1_1 <- mirt(df, 1, c('nominal', rep('2PL', 19)))

# plots of characteristic curves for item 1
itemplot(mod0, 1)
itemplot(mod1_all, 1)
itemplot(mod1_1, 1)

# factor scores
fs0 <- fscores(mod0)
fs1_all <- fscores(mod1_all)
fs1_1 <- fscores(mod1_1)


plot(fs1_all ~ fs0)
plot(fs1_1 ~ fs0)

[Image: factor score scatter plots produced by the plot() calls above]

# linear model
round(coef(lm(fs1_all ~ fs0)), 4)

(Intercept)         fs0 
    -0.0001     -0.9972 

It seems that the new theta measures something like 'ignorance' rather than 'knowledge', as it is almost exactly the negative of the original theta.
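If only the factor scores are needed, one option is to reflect them after estimation. The sketch below uses simulated stand-ins for fs0 and fs1_all rather than the fitted models, and `orient_scores` is a hypothetical helper, not part of mirt:

```r
# Hypothetical helper (not part of mirt): reflect a vector of scores so that
# it correlates positively with a reference vector.
orient_scores <- function(scores, reference) {
  if (cor(scores, reference, use = "complete.obs") < 0) -scores else scores
}

# Simulated stand-ins for fs0 and fs1_all, mimicking the slope of about -1 above
set.seed(1)
ref_scores     <- rnorm(200)
flipped_scores <- -0.9972 * ref_scores + rnorm(200, sd = 0.05)

oriented <- orient_scores(flipped_scores, ref_scores)
cor(oriented, ref_scores) > 0   # TRUE after reflection
```

With the fitted objects, the call would presumably look like orient_scores(fs1_all[, 1], fs0[, 1]), assuming fscores() returns a one-column matrix for a one-factor model. This only re-orients the scores; the item parameters keep the orientation chosen during estimation.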

Thank you Juan for your ideas, they were really helpful!

Upvotes: 0
