Reputation: 1
I need help with interpreting the output of a multiple linear regression with two categorical predictor variables and an interaction term.
I did the following linear regression:
lm(H1A1c ~ Vowel * Speaker, data=data)
Vowel and Speaker are both categorical variables. Vowel can be "breathy", "modal" or "creaky", and there are four different speakers (F01, F02, M01, M02). I want to see if a combination of those two categories can predict the values of H1A1c.
My output is this: (image: output of lm)
Please correct me if I am wrong, but I think we can see from this output that the relationship between most of my variables can't be characterised as linear. What I don't really understand is how to interpret the first p-value. When I googled, I found that all the other p-values refer to the relationship between the respective coefficient and what this coefficient relates to. E.g. the p-value in the third line refers to the relationship of the coefficient of the third line to the first one, i.e. 23.1182-9.6557.
What about the p-value of the first coefficient, though? There can't be a linear relationship if there is no relationship? What does this p-value refer to?
Upvotes: -1
Views: 1038
Reputation: 1787
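The first p-value, for (Intercept), tests the null hypothesis that the intercept is zero. With R's default treatment coding and categorical predictors, the intercept is not where a line crosses the y-axis; it is the predicted mean of H1A1c when every factor is at its reference level (here breathy, and the reference speaker, which is the first factor level unless you changed it). Since the p-value in your result is far below 0.05, you can reject the hypothesis that this mean is zero.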
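To make the meaning of the intercept concrete, here is a minimal sketch on simulated data. The data frame `d`, the number of repetitions and the simulated values are my assumptions for illustration only; they are not your measurements. It fits the same formula as in the question and compares the intercept with the mean of the reference cell.

```r
# Simulated data purely for illustration; NOT the asker's measurements.
set.seed(1)
d <- expand.grid(
  Vowel   = c("breathy", "creaky", "modal"),
  Speaker = c("F01", "F02", "M01", "M02"),
  rep     = 1:10,                 # hypothetical: 10 tokens per cell
  stringsAsFactors = TRUE         # first listed level becomes the reference
)
d$H1A1c <- rnorm(nrow(d), mean = 20, sd = 3)

# Same model formula as in the question
fit <- lm(H1A1c ~ Vowel * Speaker, data = d)

# Observed mean of the reference cell (breathy vowel, speaker F01)
ref_mean <- mean(d$H1A1c[d$Vowel == "breathy" & d$Speaker == "F01"])

# With default treatment contrasts and the full interaction,
# the intercept is exactly this cell mean
intercept <- unname(coef(fit)["(Intercept)"])
all.equal(intercept, ref_mean)
```

If a different baseline is more meaningful for your question, `relevel()` lets you choose another reference level before fitting, and the intercept and its p-value will then refer to that cell instead.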
The other p-values work the same way: each one tests whether its own coefficient is zero. Your interpretation is correct in that each of these coefficients is a difference relative to the reference level, so a small p-value suggests that this difference is not zero.
You wrote: "the p-value in the third line refers to the relationship of the coefficient of the third line to the first one, i.e. 23.1182-9.6557".
The coefficient -9.6557 means that, on average, the predicted value of H1A1c is 9.6557 units lower when GlottalContext=creaky (i.e. GlottalContextcreaky = 1) than when GlottalContext=breathy (breathy is your reference category here). Because the model contains an interaction term, this difference applies to the reference speaker; for the other speakers the corresponding interaction coefficient has to be added. The difference counts as statistically significant only if the corresponding p-value is below 0.05, which, as far as I can see, is the case for GlottalContextcreaky.
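As a worked version of that arithmetic, assuming the default treatment coding with breathy and the first speaker as reference levels (the symbols below are my own notation, not something printed in the lm output):

```latex
\begin{align*}
\hat{\mu}_{\text{breathy, ref}} &= \hat{\beta}_0 = 23.1182 \\
\hat{\mu}_{\text{creaky, ref}}  &= \hat{\beta}_0 + \hat{\beta}_{\text{creaky}}
                                 = 23.1182 - 9.6557 = 13.4625
\end{align*}
```

So 13.4625 is the predicted mean for creaky in the reference speaker. For any other speaker, that speaker's coefficient and the matching creaky-by-speaker interaction coefficient are added as well.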
(Additionally, assuming H1A1c is a continuous variable: a linear model with only categorical predictors is a valid choice and is equivalent to a two-way ANOVA. If it does not describe your data well, you might want to explore other approaches, e.g. a decision tree, or a binary/multinomial logistic regression after transforming your dependent variable to a categorical one.)
Upvotes: 0