Reputation: 195
I am trying to write a generic function that constructs a formula for linear regression. I want the function to create the formula either from variables defined by the user or from all the variables present in the data frame.
I can create the formula using all the variables present in the data frame, but my problem is with the user-defined variables: I do not know how to capture them so that I can use them later to build the formula.
The function I have so far is this:
lmformula <- function (data, IndepVariable = character, VariableList = TRUE){
  if (VariableList) {
    newlist <- list()
    newlist <- # Here is where I do not know exactly what to do to extract the variables defined by the user
    DependVariables <- newlist
    f <- as.formula(paste(IndepVariable, "~", paste((DependVariables), collapse = '+')))
  }else {
    names(data) <- make.names(colnames(data))
    DependVariables <- names(data)[!colnames(data)%in% IndepVariable]
    f <- as.formula(paste(IndepVariable,"~", paste((DependVariables), collapse = '+')))
    return (f)
  }
}
Any hint would be greatly appreciated.
Upvotes: 0
Views: 300
Reputation: 386
The only thing that changes between the two cases is how you get the independent variables.
If the user specifies them, use that character vector directly.
Otherwise, take all the variables other than the dependent variable (which you are already doing).
Note: As Roland mentioned, the formula has the form dependentVariable ~ independentVariable1 + independentVariable2 + independentVariable3, so the names in your function are swapped.
# creating mock data
data <- data.frame(col1 = numeric(0), col2 = numeric(0), col3 = numeric(0), col4 = numeric(0))
# the function
lmformula <- function(data, DepVariable, IndepVariable, VariableList = TRUE) {
  if (!VariableList) {
    IndepVariable <- names(data)[!names(data) %in% DepVariable]
  }
  f <- as.formula(paste(DepVariable, "~", paste(IndepVariable, collapse = '+')))
  return(f)
}
# working examples
lmformula(data = data, DepVariable = "col1", VariableList = FALSE)
lmformula(data = data, DepVariable = "col1", IndepVariable = c("col2", "col3"), VariableList = TRUE)
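As a possible variation, base R's reformulate() builds a formula from character vectors without pasting strings by hand. The sketch below is only one way to do it: the name lmformula2, the NULL default that replaces the VariableList flag, and the column check are my own assumptions, and mtcars is used purely as sample data.

```r
# Sketch only: lmformula2 is a hypothetical name, not part of the code above.
# Leaving IndepVariable as NULL means "use every other column in data".
lmformula2 <- function(data, DepVariable, IndepVariable = NULL) {
  if (is.null(IndepVariable)) {
    IndepVariable <- setdiff(names(data), DepVariable)
  }
  # Fail early if a requested column does not exist in the data.
  missing_cols <- setdiff(c(DepVariable, IndepVariable), names(data))
  if (length(missing_cols) > 0) {
    stop("Columns not found in data: ", paste(missing_cols, collapse = ", "))
  }
  # reformulate() joins the term labels with "+" and attaches the response.
  reformulate(termlabels = IndepVariable, response = DepVariable)
}

# All remaining columns as predictors
f_all <- lmformula2(mtcars, "mpg")
# Only the predictors chosen by the user
f_some <- lmformula2(mtcars, "mpg", c("wt", "hp"))
# The resulting formula can be passed straight to lm()
fit <- lm(f_some, data = mtcars)
```

Dropping the flag means the caller cannot pass VariableList = TRUE and forget to supply the variables, but whether that suits your interface is up to you.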
Hope it helps!
Upvotes: 2