Reputation: 151
I have a large number of regression equations that I would like to save in R and I am not sure how to do it efficiently. For example,
y1 ~ x1 + x2 + x3 + x4 (country A)
y1 ~ x1 + x2 + x4 (country B)
y1 ~ x1 + x2 + x3 + x4 (country C)
y1 ~ + x3 + x4 (country D)
Ideally, I would like to be able to answer questions such as: how many times does x2 occur? (3.) What is the most common variable? (x4.)
Should I save everything in a list, or is there a better method?
Upvotes: 2
Views: 125
Reputation: 308938
Wouldn't you prefer a single pooled model like this, fitted with glm?
y ~ x1 + x2 + x3 + x4 + country
Let the regression tell you which factors drop out.
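As a rough sketch of what that could look like: the data frame `dat`, its column values, and the coefficients used to generate `y` below are all simulated for illustration and are not from the question. Note that adding `country` as a main effect only shifts the intercept per country; if the slopes are expected to differ by country, interaction terms would be needed, which goes beyond what is suggested above.

```r
# Simulated data, purely hypothetical, so that the example runs on its own
set.seed(1)
n <- 200
dat <- data.frame(
  country = factor(sample(c("A", "B", "C", "D"), n, replace = TRUE)),
  x1 = rnorm(n),
  x2 = rnorm(n),
  x3 = rnorm(n),
  x4 = rnorm(n)
)
dat$y <- 1 + 0.5 * dat$x1 - 0.3 * dat$x4 + rnorm(n)

# One pooled model with country as a factor (gaussian family by default)
fit <- glm(y ~ x1 + x2 + x3 + x4 + country, data = dat)

# Coefficient table: estimates, standard errors, t values and p-values
coef_table <- summary(fit)$coefficients
print(coef_table)
```

Terms whose estimates are indistinguishable from zero are the candidates for dropping.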
Upvotes: 0
Reputation: 44555
Put them in a list:
myformulas <- list(
  a = y1 ~ x1 + x2 + x3 + x4,
  b = y1 ~ x1 + x2 + x4,
  c = y1 ~ x1 + x2 + x3 + x4,
  d = y1 ~ x3 + x4
)
You can then perform operations on them like:
# what variables are in which formulae
> str(lapply(myformulas, function(x) attr(terms(x), 'term.labels')))
List of 4
$ a: chr [1:4] "x1" "x2" "x3" "x4"
$ b: chr [1:3] "x1" "x2" "x4"
$ c: chr [1:4] "x1" "x2" "x3" "x4"
$ d: chr [1:2] "x3" "x4"
# where is `x1` used?
> str(lapply(myformulas, function(x) 'x1' %in% attr(terms(x), 'term.labels')))
List of 4
$ a: logi TRUE
$ b: logi TRUE
$ c: logi TRUE
$ d: logi FALSE
# how many times is each variable used?
> table(unlist(lapply(myformulas, function(x) attr(terms(x), 'term.labels'))))
x1 x2 x3 x4
3 3 3 4
From that structure you could easily answer your questions about the use of different variables.
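For the two specific questions asked (how often x2 occurs, and which variable is most common), something along these lines should work. The helper names `term_list`, `term_counts`, `x2_count`, `most_common` and `incidence` are just illustrative choices, and the list is redefined so the snippet runs on its own.

```r
myformulas <- list(
  a = y1 ~ x1 + x2 + x3 + x4,
  b = y1 ~ x1 + x2 + x4,
  c = y1 ~ x1 + x2 + x3 + x4,
  d = y1 ~ x3 + x4
)

# Right-hand-side term labels of each formula
term_list <- lapply(myformulas, function(f) attr(terms(f), "term.labels"))

# How many times does x2 occur across all formulas?
x2_count <- sum(unlist(term_list) == "x2")

# Usage count per variable, and the most common one
# (which.max returns the first maximum if there is a tie)
term_counts <- table(unlist(term_list))
most_common <- names(which.max(term_counts))

# Formula-by-variable incidence matrix: TRUE where a formula uses a variable
all_terms <- sort(unique(unlist(term_list)))
incidence <- t(vapply(term_list,
                      function(tl) all_terms %in% tl,
                      logical(length(all_terms))))
colnames(incidence) <- all_terms

print(x2_count)
print(most_common)
print(incidence)
```

The incidence matrix is handy if you later want to compare countries, since `rowSums(incidence)` gives the number of terms per formula and `colSums(incidence)` reproduces the counts from `table()`.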
Upvotes: 4