Reputation: 2663
I am regressing a continuous outcome variable on a number of factor variables using lm(). For example,
fit<-lm(dv~factor(hour)+factor(weekday)+factor(month)+factor(year)+count, data=df)
I would like to generate predicted values (yhat) for different levels of a factor variable while holding the other variables at their median or modal value. For example, how would I generate the yhat for different weekdays while holding the other factors constant?
Upvotes: 3
Views: 4212
Reputation: 2144
I may be able to assist based on @Roland's comments. I think you want plain old ANOVA, which helps determine whether factors are important or not. There's no need to wrap the variables in factor() here; integers or numbers (class: numeric) work fine. I put together the following code as an example:
#creates df
(df <- data.frame(h=c(1,3,4,0,2, 3),d=c(2*1:3), m=c(-1, 0, 3, 4, 7, 8), y=c(30,28,27,26,22, 21)))
#creates linear model, gives output
(fit<-lm(df$d~ df$h + df$m+ df$y))
#runs ANOVA on linear model
anova(fit)
#returns the fitted values for the rows used to fit the model (no newdata is supplied)
predict(fit)
ANOVA is a special case of a regression. The output will tell you whether or not the factor is significant by the P value.
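One caveat on the predict call above: without a newdata argument it only returns the fitted values for the original rows. To get predictions for chosen values of h while the other predictors are held at their medians, something along these lines should work. This is a minimal sketch on the same toy data; fit2 and newdat are names I made up for illustration, and the model is refit with data = df so that predict() can match the columns of newdata by name.

```r
# Same toy data as above
df <- data.frame(h = c(1, 3, 4, 0, 2, 3),
                 d = c(2 * 1:3),
                 m = c(-1, 0, 3, 4, 7, 8),
                 y = c(30, 28, 27, 26, 22, 21))

# Refit using bare column names plus data = df; a formula written as
# df$d ~ df$h + ... would not pick up the values supplied in newdata
fit2 <- lm(d ~ h + m + y, data = df)

# Prediction grid: vary h over its observed values,
# hold m and y fixed at their medians
newdat <- data.frame(h = sort(unique(df$h)),
                     m = median(df$m),
                     y = median(df$y))

# One predicted value (yhat) per row of the grid
newdat$yhat <- predict(fit2, newdata = newdat)
newdat
```

Each row of newdat is then one "what if" scenario, and yhat only changes because h changes.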
> anova(fit)
Analysis of Variance Table
Response: df$d
          Df  Sum Sq Mean Sq F value  Pr(>F)
df$h       1 13.2923 13.2923 89.5846 0.01098 *
df$m       1  2.2832  2.2832 15.3879 0.05927 .
df$y       1  0.1277  0.1277  0.8608 0.45147
Residuals  2  0.2968  0.1484
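In this example h (hours) explains a significant share of the variance in the dependent variable d (days), with p = 0.011. m (months) comes next but is only marginally significant (p = 0.059), and y (years) is not significant (p = 0.45). Keep in mind that anova() on an lm fit tests the terms sequentially, so these p-values depend on the order in which the terms appear in the formula.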
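Coming back to the model in the question: if you do keep the predictors as factors, the same newdata idea applies, with the modal (most frequent) level standing in for the median. Here is a rough sketch on simulated data, since the original df was not posted. Everything in it (df_sim, fit_sim, modal_level, the simulated values) is made up for illustration, and month and year are left out to keep it short.

```r
set.seed(1)  # simulated data, purely illustrative
n <- 200
days <- c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")

df_sim <- data.frame(
  hour    = factor(sample(0:23, n, replace = TRUE)),
  weekday = factor(sample(days, n, replace = TRUE), levels = days),
  count   = rpois(n, 10)
)
df_sim$dv <- rnorm(n, mean = 50 + 0.5 * df_sim$count)

fit_sim <- lm(dv ~ hour + weekday + count, data = df_sim)

# Hypothetical helper: most frequent level of a factor
modal_level <- function(x) names(which.max(table(x)))

# One row per weekday; hour held at its modal level, count at its median.
# The factors in newdata reuse the levels of the fitted data.
newdat <- data.frame(
  weekday = factor(levels(df_sim$weekday), levels = levels(df_sim$weekday)),
  hour    = factor(modal_level(df_sim$hour), levels = levels(df_sim$hour)),
  count   = median(df_sim$count)
)

newdat$yhat <- predict(fit_sim, newdata = newdat)
newdat
```

The differences in yhat across rows are then just the weekday effects, evaluated at one fixed hour and count.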
Please see this link for background:
http://www.cookbook-r.com/Statistical_analysis/ANOVA/
FYI - I recommend you include some source code to create your example. In this manner people who attempt to answer your question can all refer to the same example.
FYI2 - I recommend you add the tag "regression"
HTH.
Upvotes: 1