How does emmeans calculate confidence intervals used to compare means

Question

I'm looking for more background and documentation on how emmeans calculates confidence intervals used in the graphical comparison of means outlined in the following vignette: https://cran.r-project.org/web/packages/emmeans/vignettes/comparisons.html#graphical

In the "Graphical Comparisons" section there is an example of what I am referring to. In particular, I am interested in the red lines with arrows used for comparing means.

It reads:

If an arrow from one mean overlaps an arrow from another group, the difference is not “significant”.

But how are the red line intervals used for comparing means calculated? Is this documented somewhere?

Russ Lenth · Accepted Answer

I agree this is not sufficiently documented, and the code is pretty much a bunch of spaghetti. But I'll try to explain.

First, these comparison arrows are decidedly not confidence intervals. Confidence intervals for the means are provided by a separate option. But the comparison arrows are based on the confidence intervals for the pairwise differences of means.

Let the means be denoted m_1, m_2, ..., m_k, and let d_ij = m_i - m_j denote the difference between the ith and jth mean. Then the (1 - alpha) confidence interval for the true difference is (d_ij - e_ij, d_ij + e_ij), where e_ij is the margin of error for the difference; i.e., e_ij = t_alpha/2 * SE(d_ij). So, supposing that m_i > m_j so that d_ij > 0, d_ij is statistically significant if d_ij > e_ij.
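
As a small numerical illustration of that rule, here is a sketch in Python (used only for the arithmetic; this is not emmeans code). The function names and numbers are made up for the example, and the critical value is passed in as an assumption rather than computed from a t distribution.

```python
def margin_of_error(t_crit, se_diff):
    """Margin of error e_ij = t_crit * SE(d_ij) for one pairwise difference."""
    return t_crit * se_diff


def is_significant(m_i, m_j, t_crit, se_diff):
    """True when |m_i - m_j| exceeds the margin of error for the difference."""
    d_ij = abs(m_i - m_j)
    return d_ij > margin_of_error(t_crit, se_diff)


# Made-up numbers: two means, the SE of their difference, and a critical value.
m_i, m_j = 12.0, 7.0
t_crit, se_diff = 2.0, 2.0

e_ij = margin_of_error(t_crit, se_diff)
# Interval for the true difference: (d_ij - e_ij, d_ij + e_ij)
interval = (m_i - m_j - e_ij, m_i - m_j + e_ij)
print(e_ij, interval, is_significant(m_i, m_j, t_crit, se_diff))
```

With these numbers d_ij = 5 and e_ij = 4, so the interval for the difference excludes zero and the difference is flagged as significant.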

Now, how to get the comparison arrows. Those are plotted with origins at the m_i; we have an arrow of length L_i pointing to the left from m_i, and an arrow of length R_i pointing to the right from m_i. To compare means m_i and m_j, supposing m_i > m_j, we look at whether the arrow extending left from m_i and the arrow extending right from m_j overlap. So, ideally, we want

L_i + R_j = e_ij   for all i, j such that m_i > m_j

If we can do that, then the two arrows will overlap if, and only if, d_ij < e_ij.
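
To see why the overlap rule matches the test, note that the left arrow from m_i ends at m_i - L_i and the right arrow from m_j ends at m_j + R_j, so they overlap exactly when L_i + R_j > d_ij. A minimal Python sketch of that check follows; the function name and numbers are invented for illustration.

```python
def arrows_overlap(m_i, m_j, left_i, right_j):
    """Check whether the left arrow from the larger mean m_i overlaps the
    right arrow from the smaller mean m_j (assumes m_i > m_j)."""
    left_tip = m_i - left_i     # where the arrow pointing left from m_i ends
    right_tip = m_j + right_j   # where the arrow pointing right from m_j ends
    return right_tip > left_tip


m_i, m_j = 12.0, 7.0  # d_ij = 5.0

# Case 1: L_i + R_j = e_ij = 4.0, which is less than d_ij (significant).
print(arrows_overlap(m_i, m_j, 2.0, 2.0))

# Case 2: L_i + R_j = e_ij = 6.0, which is more than d_ij (not significant).
print(arrows_overlap(m_i, m_j, 3.0, 3.0))
```

In the first case the arrow tips sit at 10 and 9 and do not meet; in the second they sit at 9 and 10 and overlap.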

This is easy to accomplish if all the e_ij are equal: just set all L_i = R_j = e_12/2. But with different e_ij values, it may or may not even be possible. The code in emmeans uses a weighted regression method to solve the above equations. We give greater weight when d_ij is close to e_ij, because those are the cases where it is more critical that we get the lengths of the arrows right. And we have to test to make sure that L_i + R_j < d_ij when the difference is significant, and >= d_ij when it is not.
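
The idea can be sketched in Python as below. This is not the emmeans implementation: the weight formula, the alternating update scheme, the function names, and the numbers are all assumptions chosen to keep the example short. It minimizes the weighted sum of squares of (L_i + R_j - e_ij) and then runs the consistency check described above.

```python
def fit_arrows(means, e, n_iter=500):
    """Choose L[i], R[j] so that L[i] + R[j] is close to e[i][j] whenever
    means[i] > means[j], by minimizing sum of w * (L[i] + R[j] - e[i][j])**2.
    Assumes at least two distinct means."""
    k = len(means)
    pairs = [(i, j) for i in range(k) for j in range(k) if means[i] > means[j]]
    # Hypothetical weights: largest when d_ij is close to e_ij.
    w = {(i, j): 1.0 / (1.0 + abs(means[i] - means[j] - e[i][j]))
         for i, j in pairs}
    start = sum(e[i][j] for i, j in pairs) / (2 * len(pairs))
    L = {i: start for i, _ in pairs}   # the smallest mean(s) get no left arrow
    R = {j: start for _, j in pairs}   # the largest mean(s) get no right arrow
    for _ in range(n_iter):
        for i in L:  # best L[i] given the current R values
            mine = [p for p in pairs if p[0] == i]
            L[i] = (sum(w[p] * (e[i][p[1]] - R[p[1]]) for p in mine)
                    / sum(w[p] for p in mine))
        for j in R:  # best R[j] given the current L values
            mine = [p for p in pairs if p[1] == j]
            R[j] = (sum(w[p] * (e[p[0]][j] - L[p[0]]) for p in mine)
                    / sum(w[p] for p in mine))
    return L, R, pairs


def check_arrows(means, e, L, R, pairs):
    """Return pairs where arrow overlap disagrees with the test, and the
    indices of any negative arrow lengths."""
    bad = []
    for i, j in pairs:
        d = means[i] - means[j]
        significant = d > e[i][j]
        overlap = L[i] + R[j] >= d
        if significant == overlap:  # arrows should overlap only if NOT significant
            bad.append((i, j))
    negative = [i for i in L if L[i] < 0] + [j for j in R if R[j] < 0]
    return bad, negative


# Made-up example: three means and a symmetric table of margins of error e_ij.
means = [10.0, 14.0, 20.0]
e = [[0.0, 4.5, 5.5],
     [4.5, 0.0, 4.5],
     [5.5, 4.5, 0.0]]
L, R, pairs = fit_arrows(means, e)
print(L, R)
print(check_arrows(means, e, L, R, pairs))
```

With three distinct means there are three equations and four arrow lengths, so many exact solutions exist and the starting values decide which one the loop settles on.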

That's the essence of it. Note that there are additional complications to handle:

For the lowest value of m_i, L_i is completely arbitrary; in fact we don't even need to display that arrow. The same is true of R_j for the largest mean m_j. In fact, there could be additional unneeded arrows when two or more m_i are tied for the minimum or maximum values.
Depending on the number of means k and the number of tied minima and maxima, the system of equations could be under-determined, over-determined, or just right.
It's possible that the solution could result in some L_i or R_j being negative. That would be bad!

So, in summary, we try to do the best we can. The main reason for trying to do this is to encourage people to NOT EVER use confidence intervals for the m_i as a means of testing the comparisons d_ij. That is almost always incorrect. Don't ever confuse the margin of error for one mean with the margin of error for the difference of two means. Those are two different animals.
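
A quick arithmetic sketch in Python shows how the two margins of error can disagree. It assumes two independent means with equal standard errors and a shared critical value; all the numbers are made up.

```python
import math

# Assumed setup: two independent means, same standard error, same critical value.
se_mean = 1.0
t_crit = 2.0
m_i, m_j = 13.5, 10.0
d_ij = m_i - m_j

# Margin of error for a single mean.
moe_mean = t_crit * se_mean

# Margin of error for the difference: SE(d_ij) = sqrt(se^2 + se^2) under independence.
se_diff = math.sqrt(se_mean**2 + se_mean**2)
moe_diff = t_crit * se_diff

# Do the two single-mean confidence intervals overlap?
cis_overlap = (m_j + moe_mean) > (m_i - moe_mean)

# Is the difference significant by the proper comparison?
significant = d_ij > moe_diff

print(moe_mean, round(moe_diff, 3), cis_overlap, significant)
```

Here the intervals for the two means overlap (12.0 versus 11.5), yet the difference of 3.5 exceeds its own margin of error of about 2.83, so judging by the overlap of the single-mean intervals would miss a significant difference.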
