Reputation: 243
I was wondering if there was a shortcut or a symbol to include all variables with a similar name.
For instance, if I have a regression and I have 50 time dummies of the form year1, year2, year3, in Stata I can include all of these by writing year*.
Is there similar functionality in R? I know I can do something like factor(year) to get the same effect, but for a specific reason I need to have many time dummies.
Upvotes: 1
Views: 1139
Reputation: 226577
A variation on @Lyzander's answer and @G5W's comment, using reformulate():
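# Collect the names of all columns that start with "year"
yearvars <- grep("^year", names(myData), value = TRUE)
# Build the formula from the term labels and the response name
form <- reformulate(c("othervar1", "othervar2", yearvars), response = "stuff")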
The result, form, will be stuff ~ othervar1 + othervar2 + year1 + year2 + ... and can be passed straight to lm:
lm(form, data=myData)
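For anyone who wants to try this end to end, here is a minimal sketch on made-up data. The data frame myData, its column names, and its values are all invented for illustration; only grep(), reformulate() and lm() come from the answer above.

```r
# Toy data: every name and value here is hypothetical
set.seed(1)
myData <- data.frame(
  stuff     = rnorm(20),
  othervar1 = rnorm(20),
  othervar2 = rnorm(20),
  year1     = rep(c(1, 0, 0, 0), 5),
  year2     = rep(c(0, 1, 0, 0), 5),
  year3     = rep(c(0, 0, 1, 0), 5)
)

# Pick up every column whose name starts with "year"
yearvars <- grep("^year", names(myData), value = TRUE)

# Build the formula from a character vector of term labels
form <- reformulate(c("othervar1", "othervar2", yearvars), response = "stuff")

# Fit the model and list the estimated terms
fit <- lm(form, data = myData)
names(coef(fit))
```

The regular expression "^year" plays the role of Stata's year* here; a stricter pattern such as "^year[0-9]+$" may be safer if other columns also begin with "year".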
Upvotes: 3
Reputation: 37879
In R you would use a formula to define which of the dummy variables to include. So, to include year1, year2 and year3 in your model, you would create a formula using paste and as.formula:
formula <- as.formula(paste('y ~', paste0('year', 1:3, collapse = ' + ')))
formula
#y ~ year1 + year2 + year3
lm(formula, data = data)
For an lm model you could skip the as.formula call, because lm converts the string to a formula internally, but some other modelling functions require an actual formula object.
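A quick way to check this on made-up data (the data frame dat and its columns are invented for illustration): fit the same model once from a character string and once from an explicit formula, then compare the coefficients.

```r
# Toy data: every name and value here is hypothetical
set.seed(1)
dat <- data.frame(
  y     = rnorm(20),
  year1 = rep(c(1, 0, 0, 0), 5),
  year2 = rep(c(0, 1, 0, 0), 5),
  year3 = rep(c(0, 0, 1, 0), 5)
)

# Build the model specification as plain text
rhs <- paste0("year", 1:3, collapse = " + ")
f_string <- paste("y ~", rhs)

# Character string passed directly to lm
fit_string <- lm(f_string, data = dat)

# Explicit conversion to a formula object first
fit_formula <- lm(as.formula(f_string), data = dat)

# Compare the two sets of estimates
all.equal(coef(fit_string), coef(fit_formula))
```

Converting explicitly with as.formula is the more portable habit, since it does not rely on the fitting function doing the conversion for you.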
An alternative is to subset your data.frame so that it contains only the response and the variables you need, and then use y ~ . as the formula.
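A minimal sketch of that alternative, again on made-up data (dat, id and the year columns are hypothetical names): keep only the response and the year dummies, then let the dot stand for every remaining column.

```r
# Toy data: every name and value here is hypothetical
set.seed(1)
dat <- data.frame(
  y     = rnorm(20),
  id    = 1:20,                       # a column we do not want as a regressor
  year1 = rep(c(1, 0, 0, 0), 5),
  year2 = rep(c(0, 1, 0, 0), 5),
  year3 = rep(c(0, 0, 1, 0), 5)
)

# Keep the response plus every column whose name starts with "year"
keep <- c("y", grep("^year", names(dat), value = TRUE))
model_data <- dat[keep]

# The dot expands to all columns of model_data other than y
fit <- lm(y ~ ., data = model_data)
names(coef(fit))
```

Because the dot picks up every column left in the data, anything that should not be a regressor (such as id here) has to be dropped before fitting.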
Upvotes: 3