Multiple Regression in R: effect of covariable on factor or the contrary?

Question

I understand that this might be a very simple question for people with a background in statistics. And yet, I can't find a clear answer suited to my (not so) particular case :(

I have a regression model with two categorical predictors (A, with two levels A1 and A2 and B, with two levels B1 and B2), and a numeric predictor Z.

I'm interested in the interaction between Z and B, but specifically in one level of A (A1, my reference level).

As such, I fitted the following model:

lm(Y ~ A/B*Z, data=df)

This returns the following:

Fixed effects:
              Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)   -0.80420    0.02185 3160.00000 -36.811  < 2e-16 ***
A2             1.55943    0.02968 3160.00000  52.541  < 2e-16 ***
Z              0.07688    0.02561 3160.00000   3.002 0.002706 ** 
A1:B2          0.12481    0.03413 3160.00000   3.657 0.000259 ***
A2:B2         -0.09231    0.03500 3160.00000  -2.637 0.008397 ** 
A2:Z           0.05906    0.03072 3160.00000   1.923 0.054590 .  
A1:B2:Z       -0.06872    0.03959 3160.00000  -1.736 0.082668 .  
A2:B2:Z        0.01222    0.03385 3160.00000   0.361 0.718208

I believe that the 3 lines of particular interest, for me, are:

Z              0.07688    0.02561 3160.00000   3.002 0.002706 ** 
A1:B2          0.12481    0.03413 3160.00000   3.657 0.000259 ***
A1:B2:Z       -0.06872    0.03959 3160.00000  -1.736 0.082668 .

As far as I understand, the first line represents the linear relationship between my outcome Y and the numeric predictor Z in my reference cell A1B1 (so it is not a "main effect"). Thus, we can say that there is a positive linear relationship between these two in that cell.

Also, going from A1B1 to A1B2 (the second line) results in a significant increase of Y.

The third line crystallizes my issue. Does it say that:

For each one-unit increase of Z, the effect of "B2" (the second line) decreases (and is approx. 0.12-0.06=0.06)?
Or that within B2, the effect of Z (the first line) is significantly smaller (and is approx. 0.07-0.06=0.01)?

What is even more confusing for me is that if I swap those variables in the formula, the result is identical:

lm(Y ~ A/Z*B, data=df)

Fixed effects:
              Estimate Std. Error         df t value Pr(>|t|)    
...
B2             0.12481    0.03413 3160.00000   3.657 0.000259 ***
A1:Z           0.07688    0.02561 3160.00000   3.002 0.002706 ** 
...
A1:Z:B2       -0.06872    0.03959 3160.00000  -1.736 0.082668 .  
...

Bastien · Accepted Answer

If I understand your question correctly, you should interpret the A1:B2:Z line of your results as: when A==A1 and B==B2, the slope of Z is 0.06872 lower than in the reference cell A1B1, so a one-unit increase of Z changes Y by 0.06872 less.

It's important to understand that this reduction is added to the effect of Z alone (0.07688). The A2:Z interaction (0.05906) isn't relevant in that case, because A is A1, not A2. Because of the interactions, the effect of Z changes depending on the values taken by your variables A and B.

What you have here is a linear model like that:

Y = int + A2 + Z + A1:B2 + A2:B2 + A2:Z + A1:B2:Z + A2:B2:Z
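As a quick check of where these terms come from, the sketch below asks R to expand both formulas from the question without fitting anything. It relies only on base R's `terms()`; no data are needed, and the object names `tl_first` and `tl_second` are just illustrative.

```r
# Expand the formulas from the question into their term labels.
# `/` and `*` have equal precedence, so A/B*Z is read as (A/B)*Z,
# i.e. (A + A:B) * Z.
tl_first  <- attr(terms(Y ~ A/B*Z), "term.labels")
tl_second <- attr(terms(Y ~ A/Z*B), "term.labels")

print(tl_first)   # terms generated by the first formula
print(tl_second)  # terms generated by the swapped formula
```

Note that the first formula has no B main effect and the second has no Z main effect, which is why B appears only within levels of A (A1:B2, A2:B2) in the first output, and Z only within levels of A (A1:Z, A2:Z) in the second.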


Plugging in the estimates:

Y = -0.80420 + 1.55943*A2 + 0.07688*Z + 0.12481*A1:B2 - 0.09231*A2:B2 + 0.05906*A2:Z - 0.06872*A1:B2:Z + 0.01222*A2:B2:Z

where A1, A2 and B2 are indicator variables that take the value 0 or 1 depending on the levels of the specific observation (B1 is the reference level of B, so it has no indicator of its own). Replace those variables by the values for the case you are looking at and simplify the equation. For the effect of B within A1 (A1 = 1, A2 = 0):

Y = -0.80420 + 1.55943*0 + 0.07688*Z + 0.12481*1*B2 - 0.09231*0*B2 + 0.05906*0*Z - 0.06872*1*B2*Z + 0.01222*0*B2*Z
Y = -0.80420 + 0.07688*Z + 0.12481*B2 - 0.06872*B2*Z

If B==B1, then B2==0:

Y = -0.80420 + 0.07688*Z + 0.12481*0 - 0.06872*0*Z
Y = -0.80420 + 0.07688*Z

so Y increases by 0.07688 for each one-unit increase of Z.

If B==B2, then B2==1:

Y = -0.80420 + 0.07688*Z + 0.12481*1 - 0.06872*1*Z
Y = -0.67939 + 0.00816*Z

so Y increases by only 0.00816 for each one-unit increase of Z.

This is interpreted as: within A1, the effect of Z is smaller when B==B2 than when B==B1.
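The same bookkeeping can be done in R. The sketch below is only arithmetic on the coefficients copied from the first summary table in the question (no model is refitted); the names `b`, `cells` and `effect_B2_in_A1` are made up for illustration.

```r
# Coefficients copied from the first summary table in the question.
b <- c("(Intercept)" = -0.80420, "A2" = 1.55943, "Z" = 0.07688,
       "A1:B2" = 0.12481, "A2:B2" = -0.09231, "A2:Z" = 0.05906,
       "A1:B2:Z" = -0.06872, "A2:B2:Z" = 0.01222)

# Intercept and slope of Z in each of the four A x B cells,
# obtained by summing the coefficients whose indicators equal 1.
cells <- data.frame(
  cell      = c("A1B1", "A1B2", "A2B1", "A2B2"),
  intercept = unname(c(b["(Intercept)"],
                       b["(Intercept)"] + b["A1:B2"],
                       b["(Intercept)"] + b["A2"],
                       b["(Intercept)"] + b["A2"] + b["A2:B2"])),
  slope_Z   = unname(c(b["Z"],
                       b["Z"] + b["A1:B2:Z"],
                       b["Z"] + b["A2:Z"],
                       b["Z"] + b["A2:Z"] + b["A2:B2:Z"]))
)
print(cells)

# The same interaction read the other way round:
# effect of B2 (versus B1) within A1 at a given value of Z.
effect_B2_in_A1 <- function(z) unname(b["A1:B2"] + b["A1:B2:Z"] * z)
print(effect_B2_in_A1(c(0, 1)))
```

Both readings proposed in the question describe the same coefficient: A1:B2:Z is the change in the slope of Z when going from B1 to B2 within A1, and equally the change in the B2 effect within A1 per one-unit increase of Z.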

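Note: Changing the order of the terms in your model shouldn't have any effect on the fit. Both formulas give the same fitted model; the coefficients are just parameterised and labelled differently.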
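A minimal sketch to check this on your side, assuming simulated data rather than your real `df`: the data frame below is entirely hypothetical (made-up effect sizes, factor levels coded "1" and "2" so that the coefficient names look like those in your output), and plain `lm()` is used as in your call.

```r
set.seed(42)
n <- 500

# Hypothetical simulated data; this is NOT the data from the question.
df <- data.frame(
  A = factor(sample(c("1", "2"), n, replace = TRUE)),
  B = factor(sample(c("1", "2"), n, replace = TRUE)),
  Z = rnorm(n)
)
df$Y <- 1 + 0.5 * (df$A == "2") + 0.3 * df$Z -
  0.2 * (df$B == "2") * df$Z + rnorm(n, sd = 0.5)

m1 <- lm(Y ~ A/B*Z, data = df)  # formula from the question
m2 <- lm(Y ~ A/Z*B, data = df)  # swapped formula

# Same fitted values, hence the same underlying model.
print(all.equal(fitted(m1), fitted(m2)))

# The terms describing A1 coincide under both parameterisations.
print(rbind(
  first  = unname(coef(m1)[c("Z", "A1:B2", "A1:B2:Z")]),
  second = unname(coef(m2)[c("A1:Z", "B2", "A1:Z:B2")])
))
```

Other rows can differ between the two fits even though the model is the same. For example, A2:B2 is the B2 effect within A2 in the first fit, but the difference between the B2 effects in A2 and A1 in the second.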
