Reputation: 23
I am using the BradleyTerry2 package in R to analyse my data. When using the BTm function to calculate ability scores, the first item in the dataset is removed as a reference, given a score of 0 and then other ability scores are calculated relative to this reference.
Is there a way to use a null hypothesis as a reference, rather than using the first item in the dataset?
This is the code I am using. The "ID" field is player id. This code calculates an ability score for each "Matchup," relative to the first matchup in the dataset.
BTv1 <- BTm(player1=winner,player2=loser,id="ID",formula=~Matchup+(1|ID),data=btmdata)
I am trying to test against the null hypothesis that matchup has no effect on match outcomes, but currently I don't know what ability score corresponds to the null hypothesis. I would like to use this null hypothesis as a reference, rather than using the first matchup in the dataset.
For those wanting to reproduce my results, you can find my files on my university onedrive.
Upvotes: 2
Views: 144
Reputation: 3314
You can test the significance of terms in the model for ability using the anova
function, i.e.
anova(BTv1, test = "Chisq")
Using the example data and script that you shared, we get the following result:
Sequential Wald Tests
Model: binomial, link: logit
Response: NULL
Predictor: ~Characters + (1 | ID)
Terms added sequentially (first to last)
Statistic Df P(>|Chi|)
NULL
Characters 46.116 26 0.008853 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Edit: For the model BTv2
with log-ability modelled by ~ Matchup+(1|ID)
Before investigating individual matchups, we should check the significance of the term overall. Unfortunately the anova()
method for BTm objects doesn't currently work for terms with inestimable parameters, as in this case. So we'll compute this directly:
cf <- coef(BTv2)[!is.na(coef(BTv2))]
V <- vcov(BTv2)
ind <- grep("Matchup", names(cf))
chisq <- c(t(cf[ind]) %*% chol2inv(chol(V[ind, ind])) %*% cf[ind])
df <- length(ind)
c(chisq = chisq, df = df)
# chisq df
# 107.5667 167.0000
The Chi-squared statistic is less than the degrees of freedom, so the Matchup term is not significant - the model is over-fitting and it's not a good idea to investigate matchup-specific effects.
All the same, let's look at the model when fitted to the matches involving just 3 of the characters, for illustration.
summary(BTv2)$fixef
# Estimate Std. Error z value Pr(>|z|)
# MatchupCaptainFalcon;Falco -0.1327177 0.3161729 -0.4197632 0.6746585
# MatchupCaptainFalcon;Peach 0.1464518 0.3861823 0.3792297 0.7045173
# MatchupFalco;Peach -0.4103029 0.3365761 -1.2190496 0.2228254
In this case only 3 parameters are estimable, the rest are fixed to zero. Under model BTv2 for players i and j playing characters c and d respectively, we have
logit(p(i playing c beats j playing d))
= log_ability_i - log_ability_j + U_i - U_j
= Matchup_{c;d} - Matchup_{d;c} + U_i - U_j
where U_i and U_j are random player effects. So for players of the same baseline ability we have for example,
logit(p(CaptainFalcon beats Falco)) = -0.1327177 - 0 = -0.1327177
logit(p(Falco beats CaptainFalcon)) = 0 - (-0.1327177) = 0.1327177
So this tells you whether one character is favoured over another in a particular pairwise matchup.
Let's return to the BTv1 model based on all the data. In this model, for players of the same baseline ability we have
logit(p(i playing c beats j playing d)) = log_ability_i - log_ability_j
= Characters_c - Characters_d
The effect for "CharactersBowser" is set to zero, the rest are estimable. So e.g.
summary(BTv1)$fixef[c("CharactersFalco", "CharactersPeach"),]
# Estimate Std. Error z value Pr(>|z|)
# CharactersFalco 2.038925 0.9576332 2.129130 0.03324354
# CharactersPeach 2.119304 0.9508804 2.228781 0.02582845
means that
logit(p(Bowser beats Peach)) = 0 - 2.119304 = -2.119304
logit(p(Falcon beats Peach)) = 2.038925 - 2.119304 = -0.080379
So we can still compare characters in a particular matchup. We can use quasi-variances to compare the character effects
# add in character with fixed effect set to zero (Bowser)
V <- cbind(XCharactersBowser = 0, rbind(XCharactersBowser = 0,
vcov(BTv1)))
cf <- c(CharactersBowser = 0, coef(BTv1))
# compute quasi-variances
qv <- qvcalc(V, "XCharacters", estimates = cf,
labels = sub("Characters", "", names(cf)))
# plot and compare
# (need to set ylim because some estimates are essentially infinite)
par(mar = c(7, 4, 3, 1))
plot(qv, ylim = c(-5, 5))
See e.g. https://doi.org/10.1093/biomet/91.1.65 for more on quasi-variances.
Upvotes: 3