r_master

Reputation: 3

Regressing over a data frame matrix without intercept using lm

I would like to regress over a data frame matrix using lm without the intercept. If your matrix is stored as a data frame, you can simply call lm(matrix) and the first column is assumed to be the dependent variable while the rest are taken as the independent variables, with the regression including an intercept. My question is: how do I efficiently do the same if I want to regress without including the intercept?

Minimal working example:

mat <- matrix(c(2, 4, 3, 1, 5, 7, 3, 5, 30), nrow=3, ncol=3)
mat <- data.frame(mat)
lm(mat)

outputs a regression with an intercept term

Upvotes: 0

Views: 559

Answers (1)

Anders Ellern Bilgrau

Reputation: 10223

It depends on what you mean by "efficient".

If you mean syntactically brief, then I think the most elegant way is to provide the formula directly, as @nicola shows in the comments (lm(X1 ~ . + 0, data = mat)).

If you mean removing the intercept programmatically, without typing out the column names, then the code below will do that.

mat <- matrix(c(2, 4, 3, 1, 5, 7, 3, 5, 30), nrow=3, ncol=3)
mat <- data.frame(mat)

lm(update(as.formula(mat), . ~ . - 1), data = mat)
#
#Call:
#lm(formula = update(as.formula(mat), . ~ . - 1), data = mat)
# 
#Coefficients:
#     X2       X3  
# 0.9364  -0.1144  

Note that when you call lm(mat), lm coerces mat to a formula object (try running as.formula(mat)) and uses that formula. As you can see (and have noted), this automatically selects the first column as the dependent variable and the remaining columns as explanatory variables. All we need to do then is update that formula to exclude the intercept using update.
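As a quick check, the sketch below spells out the intermediate steps and compares the two approaches on the example data. The object names f, f0, fit_update and fit_literal are only illustrative and are not part of the question or the original code.

```r
mat <- data.frame(matrix(c(2, 4, 3, 1, 5, 7, 3, 5, 30), nrow = 3, ncol = 3))

# Formula derived from the data frame: first column ~ remaining columns
f <- as.formula(mat)
print(f)

# Same formula with the intercept removed
f0 <- update(f, . ~ . - 1)
print(f0)

# Programmatic version versus the formula written out by hand
fit_update  <- lm(f0, data = mat)
fit_literal <- lm(X1 ~ . + 0, data = mat)

# Both fits should contain only the X2 and X3 coefficients
print(coef(fit_update))
print(all.equal(coef(fit_update), coef(fit_literal)))
```

The update route has the advantage that it does not depend on the name of the first column, so it should carry over to data frames with other column names.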

Upvotes: 1
