Reputation: 1039
I have a three-factor contingency table that explores the association between the crime committed (shoplifting or other theft acts), gender, and prior convictions on the one hand, and lenient sentences on the other. Lenient sentence is the response variable and is binary: 1 for receiving a lenient sentence, 0 otherwise.
             Crime Gender Priorconv Yes No
1      Shoplifting    Men         N  24  1
2 Other Theft Acts    Men         N  52  9
3      Shoplifting  Women         N  48  3
4 Other Theft Acts  Women         N  22  2
5      Shoplifting    Men         P  17  6
6 Other Theft Acts    Men         P  60 34
7      Shoplifting  Women         P  15  6
8 Other Theft Acts  Women         P   4  3
You can recreate the table using these commands:
table1 <- expand.grid(Crime = factor(c("Shoplifting", "Other Theft Acts")),
                      Gender = factor(c("Men", "Women")),
                      Priorconv = factor(c("N", "P")))
table1 <- data.frame(table1,
                     Yes = c(24, 52, 48, 22, 17, 60, 15, 4),
                     No = c(1, 9, 3, 2, 6, 34, 6, 3))
I have been trying to run a logistic regression but quickly ran into trouble when I tried to include interactions between my variables. The glm works perfectly without the interactions. The code I have been using is
fit<-glm(cbind(Yes,No)~Crime+Gender+Priorconv+I(Crime*Priorconv),data=table1,family=binomial)
and the error I have been getting
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
contrasts can be applied only to factors with 2 or more levels
In addition: Warning message:
In Ops.factor(Crime, Priorconv) : * not meaningful for factors
Could you please tell me how I could deal with this error?
Thank you
Upvotes: 2
Views: 432
Reputation: 226077
By specifying I(Crime*Priorconv) you are asking R to compute the arithmetic product Crime*Priorconv, which it cannot do: multiplying factors is not meaningful, so the result is NA with a warning (the Ops.factor message in your output). If Crime and Priorconv were already numeric dummy variables (e.g. 0/1 coding with 0=shoplifting, 1=other and 0=N, 1=P), then multiplying them would make sense, and you would use the I() notation to indicate that you want the arithmetic product.
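To make the dummy-variable case concrete, here is a minimal sketch. The column names crime_other and prior_p are made up for illustration and are not in the question's data. It fits the model once with numeric 0/1 dummies and I(), and once with the original factors and the formula operator, then checks that the two fits give the same deviance.

```r
# Rebuild the data from the question
table1 <- expand.grid(Crime     = factor(c("Shoplifting", "Other Theft Acts")),
                      Gender    = factor(c("Men", "Women")),
                      Priorconv = factor(c("N", "P")))
table1 <- data.frame(table1,
                     Yes = c(24, 52, 48, 22, 17, 60, 15, 4),
                     No  = c(1, 9, 3, 2, 6, 34, 6, 3))

# Hypothetical 0/1 dummies (illustrative names, not part of the original data)
table1$crime_other <- as.numeric(table1$Crime == "Other Theft Acts")
table1$prior_p     <- as.numeric(table1$Priorconv == "P")

# With numeric columns, I() protects the arithmetic product inside the formula
fit_dummy <- glm(cbind(Yes, No) ~ crime_other + Gender + prior_p +
                   I(crime_other * prior_p),
                 data = table1, family = binomial)

# The same model written with factors and the formula interaction operator
fit_factor <- glm(cbind(Yes, No) ~ Gender + Crime * Priorconv,
                  data = table1, family = binomial)

# The two codings describe the same model, so the deviances should agree
all.equal(deviance(fit_dummy), deviance(fit_factor), tolerance = 1e-6)
```

The individual coefficients differ between the two fits because the reference levels differ (R's default factor coding takes the alphabetically first level as baseline), but the fitted values and deviance should be the same.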
Otherwise (if you don't use I()), R will interpret * as "interaction plus all lower-order effects", i.e. Crime*Priorconv corresponds to 1+Crime+Priorconv+Crime:Priorconv (where : denotes the interaction). R automatically handles the redundancies (i.e. the fact that you have already specified main effects of Crime and Priorconv): in a formula context, repeating main effects makes no difference, and neither does writing the intercept (1) explicitly or leaving it out. These formulae all specify the same model:
1+Crime+Priorconv+Crime:Priorconv
Crime+Priorconv+Crime*Priorconv
Crime+Priorconv+Crime:Priorconv
Crime*Priorconv
but I prefer the last one: as @J.R. points out in his answer, you can take advantage of the * notation to express your model more compactly.
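If you want to check the equivalence yourself, one way (a small sketch, with term_labels being a helper name I made up) is to ask terms() how R expands each formula. No data are needed for this, since the formulas contain no "." shorthand.

```r
# Helper (hypothetical name): return the expanded term labels of a formula
term_labels <- function(f) attr(terms(f), "term.labels")

# The four right-hand sides listed above, written as one-sided formulas
formulas <- list(
  ~ 1 + Crime + Priorconv + Crime:Priorconv,
  ~ Crime + Priorconv + Crime * Priorconv,
  ~ Crime + Priorconv + Crime:Priorconv,
  ~ Crime * Priorconv
)

expanded <- lapply(formulas, term_labels)
expanded

# TRUE when every formula expands to the same set of terms
all(sapply(expanded, identical, expanded[[1]]))
```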
Upvotes: 5
Reputation: 3878
You can use x:y in the formula to specify interactions between x and y, e.g.:
fit<-glm(cbind(Yes,No)~Crime+Gender+Priorconv+Crime:Priorconv,data=table1,family=binomial)
or a little shorter:
fit<-glm(cbind(Yes,No)~Gender+Crime*Priorconv,data=table1,family=binomial)
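As a possible follow-up (a sketch, not something the question asked for), you could compare the main-effects model with the interaction model to see whether the Crime:Priorconv term is worth keeping. The object names fit_main and fit_int are just illustrative.

```r
# Rebuild the data from the question
table1 <- expand.grid(Crime     = factor(c("Shoplifting", "Other Theft Acts")),
                      Gender    = factor(c("Men", "Women")),
                      Priorconv = factor(c("N", "P")))
table1 <- data.frame(table1,
                     Yes = c(24, 52, 48, 22, 17, 60, 15, 4),
                     No  = c(1, 9, 3, 2, 6, 34, 6, 3))

# Main effects only (the model that already worked in the question)
fit_main <- glm(cbind(Yes, No) ~ Crime + Gender + Priorconv,
                data = table1, family = binomial)

# Main effects plus the Crime-by-Priorconv interaction
fit_int <- glm(cbind(Yes, No) ~ Gender + Crime * Priorconv,
               data = table1, family = binomial)

# Analysis-of-deviance (likelihood-ratio) test for the added interaction term
anova(fit_main, fit_int, test = "Chisq")

# Coefficient names show which factor levels R used as baselines
names(coef(fit_int))
```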
Upvotes: 3