Mel

Reputation: 750

How to change contrasts to compare with mean of all levels rather than reference level (R, lmer)?

I have a dataset for which each row is one visit to a store by a salesperson and the fields include "outlet" (store ID), "devices" (how many electronic devices the salesperson sold) and "weekday" (the day of the week on which the salesperson was in the store).

I want to work out whether one weekday is better than the others for sales, so instead of comparing all the days of the week to e.g. Monday, I want to compare them to the mean of all the days of the week. I am using lmerTest::lmer (lme4::lmer with estimated p-values) for this.

I have tried the following code:

data$weekday <- factor(weekday_sales$weekday, levels=c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"))

contrasts(data$weekday) = contr.sum(7) 

summary(lmerTest::lmer(data=data, devices~weekday + (1|outlet)))

which gives:

Fixed effects:
            Estimate Std. Error       df t value Pr(>|t|)    
(Intercept)   4.3681     0.6024  12.4472   7.251 8.24e-06 ***
weekday1     -1.0585     0.5129 145.7337  -2.064  0.04080 *  
weekday2     -0.2830     0.4958 142.3214  -0.571  0.56913    
weekday3      1.1884     0.4907 140.5545   2.422  0.01671 *  
weekday4      0.1100     0.5025 145.1407   0.219  0.82707    
weekday5      1.3589     0.5135 143.8204   2.646  0.00904 ** 
weekday6     -0.1629     0.5020 143.1605  -0.325  0.74600   

However, all seven weekdays are present in the dataset, yet only six coefficients appear in the output (one is missing). Also, the levels of the weekdays in the dataset are stored as "Monday", "Tuesday", "Wednesday" etc., not as "weekday1", "weekday2" etc.

Why is there one weekday missing and how do I know which one this is? Does this compare each weekday to the mean or is it doing something else? (And if so how do I change the contrasts to compare all levels to the mean of all levels?)

Upvotes: 2

Views: 1242

Answers (2)

Ben Bolker

Reputation: 226761

You need to explicitly suppress the intercept:

devices ~ -1 + weekday + (1|outlet)

or

devices ~ 0 + weekday + (1|outlet)

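It's not particularly obvious from the output, but when you use sum-to-zero contrasts, the first parameter is (level 1 - mean), the second is (level 2 - mean), etc., with the levels taken in the order of the factor (so weekday1 is Monday, weekday2 is Tuesday, and so on). The comparison that's missing is therefore the last level: "Sunday vs. mean".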

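# Simulated example: 10 observations per weekday
set.seed(101)
w <- c("Monday", "Tuesday", "Wednesday", "Thursday",
       "Friday", "Saturday", "Sunday")
dd <- data.frame(w = factor(rep(w, 10), levels = w), y = rnorm(70))
# Sum-to-zero contrasts with an intercept: intercept plus six coefficients
m0 <- lm(y ~ w, dd, contrasts = list(w = contr.sum))
# Same model with the intercept suppressed: one coefficient per weekday
m1 <- lm(y ~ w - 1, dd, contrasts = list(w = contr.sum))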
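As a rough illustration of the point about the missing last level: because sum-to-zero effects add up to zero, the "Sunday vs. mean" estimate can be recovered from the six reported coefficients. The sketch below uses base R `lm` on the same kind of simulated data; the names `eff`, `eff_all`, `check` and `se_sunday` are made up for this example and are not part of the original answer.

```r
set.seed(101)
w <- c("Monday", "Tuesday", "Wednesday", "Thursday",
       "Friday", "Saturday", "Sunday")
dd <- data.frame(w = factor(rep(w, 10), levels = w), y = rnorm(70))

# Fit with sum-to-zero contrasts and an intercept
m0 <- lm(y ~ w, dd, contrasts = list(w = contr.sum))

# w1..w6 are (level mean - mean of the level means) for Monday..Saturday
eff <- coef(m0)[-1]
names(eff) <- w[1:6]

# The effects sum to zero, so Sunday's effect is minus the sum of the rest
eff_all <- c(eff, Sunday = -sum(eff))

# Its standard error follows from the covariance of the six coefficients:
# Var(-sum(b)) = 1' V 1, i.e. the sum of all entries of V
se_sunday <- sqrt(sum(vcov(m0)[-1, -1]))

# Cross-check against the raw level means
level_means <- tapply(dd$y, dd$w, mean)
check <- as.vector(level_means) - mean(level_means)
print(round(cbind(from_model = eff_all, from_means = check), 4))
print(se_sunday)
```

For an `lmer` fit the same arithmetic should apply to the fixed effects (for example via `fixef()` and `vcov()` on the fitted model), though that is an assumption not checked here.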

Upvotes: 2

David_O

Reputation: 1153

The problem is that with sum contrasts, you can't compare all groups to the overall mean because the comparisons aren't independent. If you know the grand mean G and the means of days 1-6, then the mean of day 7 can be calculated from the values you already have. So basically, you can't do it using contrasts alone; you'd need a post-hoc test of some kind.

With the standard treatment contrasts, you still only make six comparisons (1-2, 1-3, 1-4, 1-5, 1-6, 1-7) and the usual question is: hey, where did 1 go? The answer there is that it is the intercept. Here, you have G-1, G-2, G-3, G-4, G-5, G-6 and then lose G-7.
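A small sketch of the two parameterisations side by side, using base R `lm` on simulated data (the data and the names `mt`, `ms` and `level_means` are invented for illustration). Both fits have seven coefficients and identical fitted values; they differ only in what the intercept and the six remaining coefficients mean. In R's output the six coefficients are reported as (level k - level 1) for treatment contrasts and (level k - G) for sum contrasts.

```r
set.seed(101)
w <- c("Monday", "Tuesday", "Wednesday", "Thursday",
       "Friday", "Saturday", "Sunday")
dd <- data.frame(w = factor(rep(w, 10), levels = w), y = rnorm(70))
level_means <- as.vector(tapply(dd$y, dd$w, mean))

# Treatment contrasts: the intercept is the mean of level 1 (Monday)
mt <- lm(y ~ w, dd, contrasts = list(w = contr.treatment))

# Sum contrasts: the intercept is G, the mean of the seven level means
ms <- lm(y ~ w, dd, contrasts = list(w = contr.sum))

print(c(treatment_intercept = unname(coef(mt)[1]),
        monday_mean = level_means[1]))
print(c(sum_intercept = unname(coef(ms)[1]),
        grand_mean = mean(level_means)))

# Same model, different coding: the fitted values agree
print(all.equal(fitted(mt), fitted(ms)))
```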

Upvotes: 1
