Alex Holcombe

Reputation: 2591

Converting R formula format to mathematical equation

When we fit a statistical model in R, say

lm(y ~ x, data=dat)

we use R's special formula syntax: "y ~ x".

Is there something that converts from such a formula to the corresponding equation? In this case it could be written as:

y = B0 + B1*x

This would be very useful. First, with more complicated formulae I don't trust my own translation. Second, in scientific papers written with R/Sweave/knitr, the model sometimes has to be reported in equation form, and for fully reproducible research we'd like to do this in an automated fashion.

Upvotes: 9

Views: 2239

Answers (1)

Sam Mason

Reputation: 16214

Just had a quick play and got this working:

# define a function to take a linear regression
#  (anything that supports coef() and terms() should work)
expr.from.lm <- function (fit) {
  # the terms we're interested in
  con <- names(coef(fit))
  # current expression (built from the inside out)
  expr <- quote(epsilon)
  # prepend expressions, working from the last symbol backwards
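  # rev(seq_along(con)) is safe even if there are no coefficients,
  # whereas length(con):1 would iterate over 0:1
  for (i in rev(seq_along(con))) {
    if (con[[i]] == '(Intercept)') {
      expr <- bquote(beta[.(i-1)] + .(expr))
    } else {
      expr <- bquote(beta[.(i-1)] * .(as.symbol(con[[i]])) + .(expr))
    }
  }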
  # add in response
  expr <- bquote(.(terms(fit)[[2]]) == .(expr))
  # convert to expression (for easy plotting)
  as.expression(expr)
}

# generate and fit dummy data
df <- data.frame(iq=rnorm(10), sex=runif(10) < 0.5, weight=rnorm(10), height=rnorm(10))
f <- lm(iq ~ sex + weight + height, df)
# plot with our expression as the title
plot(resid(f), main=expr.from.lm(f))

There's a lot of freedom in what the variables are called, and in whether you actually want the coefficient values in there as well, but this seems good for a start.
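For the Sweave/knitr use case in the question, the same idea could emit LaTeX text instead of a plotmath expression. Here's a minimal sketch, not a polished solution: latex.from.lm is a made-up helper name, it uses the coefficient names from coef() as they are, and it assumes those names contain nothing LaTeX would choke on (underscores or interaction colons would need escaping).

```r
# hypothetical helper: build a LaTeX equation string from a fitted model
# (assumes coef() and terms() work on `fit`, as with expr.from.lm above)
latex.from.lm <- function (fit) {
  con <- names(coef(fit))
  # response variable, taken from the left-hand side of the formula
  lhs <- deparse(terms(fit)[[2]])
  # one term per coefficient: beta_0 for the intercept, beta_k * name otherwise
  rhs <- vapply(seq_along(con), function (i) {
    b <- sprintf("\\beta_{%d}", i - 1L)
    if (con[[i]] == "(Intercept)") {
      b
    } else {
      sprintf("%s \\, \\mathrm{%s}", b, con[[i]])
    }
  }, character(1))
  # join the terms and append the error term
  paste0(lhs, " = ", paste(c(rhs, "\\epsilon"), collapse = " + "))
}

# dummy data matching the example in the question
set.seed(1)
dat <- data.frame(y = rnorm(10), x = rnorm(10))
cat(latex.from.lm(lm(y ~ x, data = dat)), "\n")
# y = \beta_{0} + \beta_{1} \, \mathrm{x} + \epsilon
```

In a knitr document the string could then be wrapped in $$ ... $$ inside a chunk with results='asis'. As with the plotmath version, the beta indices just follow the order of coef(fit), so a model fitted without an intercept still starts counting at beta_0.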

Upvotes: 5
