Reputation: 1837
I am trying to create a generic function to handle a data frame with multiple plausible values. What I want is to pass a formula to a function to perform a regression such as:
f <- MRPCM ~ DSEX + IEP + ELL3 + SDRACEM + PARED
The MRPCM variable does not actually exist in the data frame. Instead five variables, MRPCM1, MRPCM2, MRPCM3, MRPCM4, and MRPCM5 do exist. What I want to do is iterate and update the formula (f here) to create five formulas. Can this be done? The update.formula function seems to work on the entire left or right side at a time. I should also note that in this example the variable I wish to change is the dependent variable so that update(f, MRPCM1 ~ .) works. However, I will not know where the variable appears in the formula.
For example:
f <- MRPCM + DSEX ~ IEP + ELL3 + SDRACEM + PARED
update.formula(f, as.formula('MRPCM1 ~ .'))
Results in this (note that DSEX is missing now):
MRPCM1 ~ IEP + ELL3 + SDRACEM + PARED
Upvotes: 4
Views: 2956
Reputation: 162321
Here's a demonstration of one approach. A more sophisticated implementation might instead update the fitted linear model (see ?update), but that goes beyond the immediate scope of your question.
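For reference, the fitted-model route might look roughly like the sketch below; the demonstration that follows takes the formula-updating route instead. The names dat, fit1, responses and fits are made up for illustration, and the seed is only there to make the sketch repeatable.

```r
## Sketch only: refit an existing lm with a different response via update().
## `dat`, `fit1`, `responses` and `fits` are hypothetical names.
set.seed(1)  # assumption: fixed seed so the random data are repeatable
dat <- setNames(as.data.frame(matrix(rnorm(96), ncol = 8)),
                c("MRPCM1", "MRPCM2", "MRPCM3", "DSEX", "IEP",
                  "ELL3", "SDRACEM", "PARED"))

## Fit once with the first plausible value as the response
fit1 <- lm(MRPCM1 ~ DSEX + IEP + ELL3 + SDRACEM + PARED, data = dat)

## Refit with each response in turn, keeping the right hand side unchanged
responses <- c("MRPCM1", "MRPCM2", "MRPCM3")
fits <- lapply(responses, function(var) {
    update(fit1, as.formula(paste(var, "~ .")))
})
names(fits) <- responses

## One column of coefficients per response
sapply(fits, coef)
```

update() re-evaluates the original lm call with the new formula, so anything else passed to lm (weights, subset, and so on) carries over to each refit.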
## Make a reproducible example!!
df <-
    setNames(as.data.frame(matrix(rnorm(96), ncol=8)),
             c("MRPCM1","MRPCM2","MRPCM3","DSEX","IEP", "ELL3","SDRACEM","PARED"))
## Construct a template formula
f <- MRPCM ~ DSEX + IEP + ELL3 + SDRACEM + PARED
## Workhorse function
iterlm <- function(formula, data) {
    ## Find columns in data matching pattern on left hand side of formula
    LHSpat <- deparse(formula[[2]])
    LHSvars <- grep(LHSpat, names(data), value = TRUE)
    ## Run through matched columns, repeatedly updating the formula,
    ## fitting linear model, and extracting whatever results you want.
    ## (Uses the function's own arguments rather than the globals f and df.)
    sapply(LHSvars, FUN=function(var) {
        uf <- update.formula(formula, as.formula(paste(var, "~ .")))
        coef(lm(uf, data))
    })
}
## Try it
iterlm(f, df)
##                  MRPCM1     MRPCM2      MRPCM3
## (Intercept)  0.71638942 -0.3883355  0.22202700
## DSEX        -0.07048994 -0.7478064  0.62590580
## IEP         -0.22716821 -0.2381982  0.12205780
## ELL3        -0.44492392  0.1720344  0.41251561
## SDRACEM      0.21629235  0.4800773  0.02866802
## PARED        0.07885683 -0.2582598 -0.07996121
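The function above only rewrites the left hand side as a whole, so it does not cover the case where the placeholder sits next to other terms or on the right hand side. One possible way to handle that is to substitute the symbol wherever it occurs, as in this rough sketch. swap_var is a hypothetical helper name, not a base R function, and the sketch assumes the placeholder appears in the formula as a plain variable name.

```r
## Hypothetical helper (not part of base R): replace the symbol `old` with
## `new` wherever it occurs in `formula`, on either side of the `~`.
swap_var <- function(formula, old, new) {
    ## Named list mapping the old variable name to the new symbol
    replacement <- setNames(list(as.name(new)), old)
    ## do.call() passes the formula object itself to substitute()
    out <- do.call("substitute", list(formula, replacement))
    out <- as.formula(out, env = environment(formula))
    ## Keep the environment of the original formula
    environment(out) <- environment(formula)
    out
}

## The placeholder shares the left hand side with DSEX here
f <- MRPCM + DSEX ~ IEP + ELL3 + SDRACEM + PARED

## One formula per plausible value
formulas <- lapply(paste0("MRPCM", 1:5), function(v) swap_var(f, "MRPCM", v))
```

Each element of formulas can then be handed to lm() or any other modelling function in place of the template.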
Upvotes: 6