Reputation: 685
I'm constructing a linear model to evaluate the effect of distance from a habitat boundary on the richness of an order of insects. There are some differences in the equipment used, so I am including equipment as a categorical variable to check that it hasn't had a significant effect on richness.
The categorical factor has three levels, so I asked R to produce dummy variables in the lm call by using the code:
lm(Richness ~ Distances + factor(Equipment), data = Data)
When I ask for the summary of the model I can see two of the levels with their coefficients. I am assuming this means R is using one of the levels as the "standard" (reference) against which the coefficients of the other levels are compared.
How can I find the coefficient for the third level in order to see what effect it has on the model?
Thank you
Upvotes: 1
Views: 3002
Reputation: 419
To see how to recover the coefficient for each level, here is a simple example:
# load data
data(mtcars)
head(mtcars)
# what are the means of wt given the factor carb?
(means <- with(mtcars, tapply(wt, factor(carb), mean)))
# run the lm
mod <- with(mtcars, lm(wt~factor(carb)))
# extract the coefficients
coef(mod)
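# the intercept is the mean of wt for the reference level (carb 1)
coef(mod)[1]
# the remaining coefficients are differences from the reference level
coef(mod)[2:6]
# intercept + difference recovers the mean of each non-reference level
coef(mod)[1] + coef(mod)[2:6]
# compare with the group means computed above
means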
So you can see that, in this simple case, each coefficient is a difference that is added to the reference level (i.e., the intercept) to give that level's mean. However, if you have a covariate, it gets more complicated:
mod2 <- lm(wt ~ factor(carb) + disp, data=mtcars)
summary(mod2)
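The intercept is now the expected wt for carb 1 when disp = 0, and each factor coefficient is the difference from carb 1 at any fixed value of disp.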
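If what you want is the comparison involving the reference level itself, one option is to refit with a different baseline using relevel(). The sketch below is only an illustration: it uses mtcars as stand-in data, with cyl (three levels) playing the role of Equipment and disp the role of Distances, and the column name cyl8ref is made up for the example.

```r
# Stand-in data (assumption): cyl has 3 levels like Equipment, disp stands in for Distances
dat <- mtcars
dat$cyl <- factor(dat$cyl)

fit <- lm(wt ~ disp + cyl, data = dat)
coef(fit)            # (Intercept), disp, cyl6, cyl8; cyl4 is the reference level
levels(dat$cyl)[1]   # the first level is the one R uses as the reference

# Refit with "8" as the reference so that level 4 gets its own coefficient
dat$cyl8ref <- relevel(dat$cyl, ref = "8")   # hypothetical helper column
fit2 <- lm(wt ~ disp + cyl8ref, data = dat)
coef(fit2)           # now reports cyl8ref4 and cyl8ref6

# Same model, different parameterisation: the 4-vs-8 contrast just flips sign
all.equal(unname(coef(fit2)["cyl8ref4"]), -unname(coef(fit)["cyl8"]))
```

Both fits give identical predictions; only the level that the other coefficients are measured against changes.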
Upvotes: 1
Reputation: 4797
You can use lm(y ~ x - 1)
to remove the intercept. R then reports a coefficient for every level of the factor instead of folding the reference level into the intercept. That being said, there are statistical reasons for keeping one of the levels as a reference.
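As a rough sketch of what dropping the intercept does, again using mtcars as stand-in data (cyl for Equipment and disp for Distances are assumptions for illustration only):

```r
# Stand-in data (assumption): cyl has 3 levels like Equipment, disp stands in for Distances
dat <- mtcars
dat$cyl <- factor(dat$cyl)

with_int <- lm(wt ~ disp + cyl, data = dat)       # default: cyl4 absorbed into the intercept
no_int   <- lm(wt ~ disp + cyl - 1, data = dat)   # no intercept: one coefficient per level

coef(no_int)   # disp, cyl4, cyl6, cyl8

# Each per-level coefficient equals intercept + contrast from the default fit
per_level <- coef(with_int)["(Intercept)"] + c(0, coef(with_int)[c("cyl6", "cyl8")])
all.equal(unname(per_level), unname(coef(no_int)[c("cyl4", "cyl6", "cyl8")]))
```

The two models fit the data identically, but the summary() output is not directly comparable: without an intercept, each level's t-test is against zero rather than against the reference level, and the reported R-squared is computed differently.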
Upvotes: 2