Reputation: 31
Using data from the fivethirtyeight package...
library(fivethirtyeight)
grads <- college_recent_grads
I created a subset of the grads data that includes only the variables I need:
data <- grads[, c("men", "major_category", "employed",
                  "employed_fulltime_yearround", "p25th",
                  "p75th", "total")]
Then I split the data subset up by major category and omitted the one NA value in the data:
majorcats <- split(data, data$major_category)
names(majorcats)
# na.omit() on a list returns it unchanged, so apply it to each data frame
majorcats <- lapply(majorcats, na.omit)
Next I tried to fit a regression model inside a function called facts, where the user specifies x, y, and z, with z being a major category (which is why I split the data subset by major_category):
facts <- function(x, y, z){
  category <- majorcats[["z"]]
  summary(lm(y ~ x, data = category))
}
Unfortunately, when I call facts with variables that are part of the majorcats data set, such as
facts(men, p25th, Arts)
I get the error below:
Error in model.frame.default(formula = y ~ x, data = category,
drop.unused.levels = TRUE) :
invalid type (NULL) for variable 'y'
Called from: model.frame.default(formula = y ~ x, data = category,
drop.unused.levels = TRUE)
Browse[1]>
Can someone please explain what this error means, and how I might be able to fix it?
Upvotes: 0
Views: 84
Reputation: 107567
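Inside your function, majorcats[["z"]] looks up an element literally named "z" rather than the value of z, so category is NULL. Likewise, y ~ x refers to variables literally named y and x, not to the columns you passed in. Pass the parameters as string literals and build the formula from those strings: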
facts <- function(x, y, z){
  category <- majorcats[[z]]
  model <- as.formula(paste(y, "~", x))
  # ALTERNATIVE: model <- reformulate(x, response = y)
  summary(lm(model, data = category))
}
facts("men", "p25th", "Arts")
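If you want to try the same pattern without installing fivethirtyeight, here is a minimal sketch on the built-in mtcars data. The names groups and fit_by_group are hypothetical stand-ins for majorcats and facts, and the check for an unknown group name is my own addition, not something your original function had:

```r
# Toy stand-in for majorcats: built-in mtcars split by cylinder count.
# split() names the list elements "4", "6", and "8".
groups <- split(mtcars, mtcars$cyl)

# Hypothetical helper: x, y, and z are all passed as character strings.
fit_by_group <- function(x, y, z, data_list = groups) {
  subset_df <- data_list[[z]]             # z is evaluated, not the literal "z"
  if (is.null(subset_df)) {
    stop("no group named '", z, "'")      # fail early on an unknown group
  }
  model <- reformulate(x, response = y)   # builds the formula y ~ x from strings
  lm(model, data = subset_df)
}

fit <- fit_by_group("wt", "mpg", "4")
summary(fit)
```

Returning the lm object rather than its summary lets the caller decide whether to call summary(), coef(), or predict() on the fit.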
Upvotes: 1