Reputation: 1756
Just a very quick question: I want to run negative binomial regressions using MASS. The dependent variables are val1, val2 and val3 (one model each), and the independent variables are a, b, c and d.
Here is some fake data:
library(data.table)
library(MASS)
test <- data.table(val1 = 1:10, val2 = 11:20, val3 = 21:30, a = rnorm(10), b = rnorm(10), c = rnorm(10), d = rnorm(10))
summary1 <- glm.nb(val1 ~ a + b + c + d, data = test)
summary2 <- glm.nb(val2 ~ a + b + c + d, data = test)
summary3 <- glm.nb(val3 ~ a + b + c + d, data = test)
I think this code is ugly, so I tried a loop:
for (i in c("val1", "val2", "val3")){
paste("sum_", c("val1", "val2", "val3"), sep = "") <- glm.nb(i ~ a + b + c + d, data = simple)
}
But it didn't work. Any suggestions for improvement?
The original data has about 26 independent variables, and the code gets even uglier when every one of them is written out, e.g. sum1 <- glm.nb(val3 ~ a + b + c + d + e + f + g + h + i + j + k + l, data = test)
I know the following code might be helpful, but I don't know how to use it... :(
diff <- setdiff(colnames(test),c('val1','val2','val3'))
Also, can the lapply function achieve this within data.table?
Thanks a lot!
Upvotes: 4
Views: 230
Reputation: 18323
And here is a solution with lapply:
summary.list <- lapply(test[, .SD, .SDcols = patterns('val')],
                       function(i) glm.nb(i ~ a + b + c + d, data = test))
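As a rough sketch of what can be done with such a list of fits afterwards, the coefficients can be stacked into one table. The data here is a hypothetical stand-in (a plain data.frame with simulated negative binomial counts and a fixed seed, not the question's table), and `dat`, `responses`, `fits` and `coef_table` are illustrative names only.

```r
library(MASS)

# Hypothetical stand-in data: simulated overdispersed counts,
# not the data from the question.
set.seed(42)
n <- 200
dat <- data.frame(
  val1 = rnbinom(n, mu = 5,  size = 2),
  val2 = rnbinom(n, mu = 10, size = 2),
  val3 = rnbinom(n, mu = 20, size = 2),
  a = rnorm(n), b = rnorm(n), c = rnorm(n), d = rnorm(n)
)

responses <- c("val1", "val2", "val3")

# One model per response column; lapply keeps the column names,
# so the result is a list named val1, val2, val3.
fits <- lapply(dat[responses],
               function(y) glm.nb(y ~ a + b + c + d, data = dat))

# Stack the coefficient vectors: one row per response.
coef_table <- do.call(rbind, lapply(fits, coef))
print(coef_table)
```

Individual models remain available as `fits$val1`, so `summary(fits$val1)` still works as usual.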
Upvotes: 1
Reputation: 121608
It is better to put your data in long format:
library(plyr)
library(reshape2)
xx <- melt(test,measure.vars=paste0('val',1:3))
ddply(xx, .(variable), function(x) {
  coef(glm.nb(value ~ ., data = subset(x, select = -variable)))
})
  variable (Intercept)            a            b           c          d
1     val1    1.583602 -0.045909060 -0.018189342 0.026293033 0.29708648
2     val2    2.704601 -0.014641683 -0.003836401 0.006711503 0.10445377
3     val3    3.217729 -0.008925782 -0.001863267 0.003475509 0.06292286
If you want the full models, not just the coefficients:
dlply(xx, .(variable), function(x) {
  glm.nb(value ~ ., data = subset(x, select = -variable))
})
Upvotes: 5
Reputation: 1446
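Using your loop approach, I would simply store all the models in a list, like so:
results <- list()
for (i in c("val1", "val2", "val3")) {
  # build the formula as a string, then convert it
  frml <- paste(i, "~ a + b + c + d")
  frml <- as.formula(frml)
  # the question's data is called test (the original loop used an undefined object, simple)
  results[[i]] <- glm.nb(frml, data = test)
}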
And then access the models in the list by looking at results$val1, etc.
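With many predictors, the right-hand side does not have to be typed out. A minimal sketch, assuming every column other than the responses is a predictor: combine the question's setdiff() idea with base R's reformulate(). The data is again a hypothetical stand-in (a plain data.frame of simulated counts with a fixed seed), and `dat`, `predictors`, `formulas` and `models` are illustrative names. Since only names() is used, the same approach should carry over to a data.table.

```r
library(MASS)

# Hypothetical stand-in data: simulated overdispersed counts,
# not the data from the question.
set.seed(1)
n <- 200
dat <- data.frame(
  val1 = rnbinom(n, mu = 5,  size = 2),
  val2 = rnbinom(n, mu = 10, size = 2),
  val3 = rnbinom(n, mu = 20, size = 2),
  a = rnorm(n), b = rnorm(n), c = rnorm(n), d = rnorm(n)
)

responses <- c("val1", "val2", "val3")
names(responses) <- responses  # so the result lists are named

# Every column that is not a response is treated as a predictor.
predictors <- setdiff(names(dat), responses)

# reformulate() builds e.g. val1 ~ a + b + c + d from character vectors.
formulas <- lapply(responses,
                   function(r) reformulate(predictors, response = r))

# Fit one negative binomial model per formula.
models <- lapply(formulas, function(f) glm.nb(f, data = dat))

print(formulas$val1)
print(coef(models$val1))
```

The design choice here is to keep formula construction separate from fitting, so the formulas can be inspected before any model is run.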
Upvotes: 2