Raj Raina

Reputation: 99

R: Store the results of a linear model directly to a data frame

Suppose I have a data frame that is:

df:
x|y
1|2
2|3
3|5
4|8

If I do

lm(y~x, data=df)

I will get the linear model

y=-0.5+2*x. 

How can I store the estimates of this linear model directly to the data frame, so it looks something like

df:
x|y|estimate.from.LM
1|2|1.5
2|3|3.5
3|5|5.5
4|8|7.5

Of course, one way to do this is to simply make a new column by hand and assign it the direct value of the linear model, like

df$estimate.from.LM=-.5+2*df$x

That is easy to do in this example. But when the linear models get much more complicated, with uglier coefficients and far more variables, is there an elegant way to store their estimates in the df?

Upvotes: 1

Views: 1996

Answers (1)

Ben Bolker

Reputation: 226192

The predict() function does what you want, and so does the fitted() function. predict() has more options (see ?predict.lm; as @Frank said in the comments, this is linked from the See Also section of ?lm).

dd <- data.frame(x=1:4,y=c(2,3,5,8))
dd$est <- predict(lm(y~x,data=dd))
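To make the difference between the two functions concrete, here is a minimal sketch on the same toy data. The names m and new_points are just illustrative, and the x values 5 and 6 are made up for the example. fitted() returns the estimates for the rows the model was fitted on, while predict() can also take a newdata argument to estimate y at x values that were not in the original data.

```r
# Same toy data as above
dd <- data.frame(x = 1:4, y = c(2, 3, 5, 8))

# Fit once and keep the model object so it can be reused
m <- lm(y ~ x, data = dd)

# fitted() gives the estimates for the rows used in the fit
dd$est <- fitted(m)

# With no newdata, predict() returns the same values as fitted()
same <- isTRUE(all.equal(unname(fitted(m)), unname(predict(m))))

# predict() can also estimate y at new x values (hypothetical points)
new_points <- data.frame(x = c(5, 6))
new_points$est <- predict(m, newdata = new_points)

print(dd)
print(new_points)
```

Because nothing here depends on the form of the formula, the same two lines should carry over to models with many predictors. One caveat: if the data contain missing values, the default na.action drops those rows, so the result can be shorter than the data frame; fitting with na.action = na.exclude pads the result with NA instead.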

A good general book on modeling in R should tell you this (e.g. Dalgaard's Introductory Statistics with R, or Julian Faraway's books). I'm sure there are also a zillion tutorials online, although I can't point you to a specific one. One hint for finding out what you can do with a fitted model is to list the methods defined for its class:

m <- lm(y~x,data=dd) ## fitted model
class(m)   ## "lm"
methods(class="lm")
## [1] add1           alias          anova          case.names     coerce        
## [6] confint        cooks.distance deviance       dfbeta         dfbetas       
## [11] drop1          dummy.coef     effects        extractAIC     family        
## [16] formula        hatvalues      influence      initialize     kappa         
## [21] labels         logLik         model.frame    model.matrix   nobs          
## [26] plot           predict        print          proj           qr            
## [31] residuals      rstandard      rstudent       show           simulate      
## [36] slotsFromS3    summary        variable.names vcov         

Now you can try to guess whether any of these might be useful (or look up their help files via, e.g., ?confint.lm).
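As a rough illustration of trying a few of the listed methods, the sketch below calls coef(), residuals(), and confint() on the same toy model. The column name resid is only an example. Per-row results such as residuals can be stored in the data frame the same way as the estimates.

```r
dd <- data.frame(x = 1:4, y = c(2, 3, 5, 8))
m <- lm(y ~ x, data = dd)

# Coefficients of the fitted line (intercept and slope)
cf <- coef(m)

# Residuals are per-row, so they fit naturally into the data frame
dd$resid <- residuals(m)

# Confidence intervals for the coefficients (95% by default)
ci <- confint(m)

print(cf)
print(dd)
print(ci)
```

confint() returns a matrix with one row per coefficient rather than one value per observation, so it is something to inspect on its own rather than add as a column.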

Upvotes: 1
