user572780

Reputation: 379

Loop over data table columns and apply glm

I am trying to loop over my data table columns and apply glm to each column using a for-loop. I then want to extract regression coefficients from the model and add them to my output data table.

Here dt is a data table and y is a vector:

output = data.table('reg_coef' = numeric())
for(n in 1:ncol(dt)){
  model = glm(y ~ dt[, n], family=binomial(link="logit"))
  reg_coef = summary(model)$coefficients[2]
  output = rbindlist(list(output, list(reg_coef)))
}

Why doesn't this work? I am getting this error:

Error in `[.data.table`(dt, , n) : 
  j (the 2nd argument inside [...]) is a single symbol but column name 'n' is not found. Perhaps you intended DT[, ..n]. This difference to data.frame is deliberate and explained in FAQ 1.1. 

Upvotes: 0

Views: 262

Answers (2)

akrun

Reputation: 887158

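The error occurs because data.table evaluates j inside the table, so dt[, n] looks for a column named n. Instead, we can build the formula from each column name with paste0 and pass dt as the data argument: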

output <- do.call(rbind, lapply(names(dt), function(x) {
  model <- glm(as.formula(paste0('y ~ ', x)), dt, family=binomial(link="logit"))
  summary(model)$coefficients[2]
}))
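
If you would rather keep the original for-loop, a minimal sketch of it with the column lookup fixed could look like the following. The toy data (columns x1 and x2, and the vector y) are made up for illustration and are not from the question; dt[[n]] is used to pull column n out as a plain vector.

```r
library(data.table)

# Toy data, assumed for illustration only
set.seed(1)
dt <- data.table(x1 = rnorm(100), x2 = rnorm(100))
y  <- rbinom(100, 1, 0.5)

# Preallocate the result instead of growing a table inside the loop
reg_coef <- numeric(ncol(dt))
for (n in seq_len(ncol(dt))) {
  # dt[[n]] returns column n as a vector; dt[, n] would look for a column named "n"
  model <- glm(y ~ dt[[n]], family = binomial(link = "logit"))
  reg_coef[n] <- coef(model)[2]  # slope estimate for column n
}

# One row per column of dt
output <- data.table(variable = names(dt), reg_coef = reg_coef)
```

Preallocating reg_coef avoids calling rbindlist on every iteration, which matters little for a handful of columns but adds up for wide tables.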

Upvotes: 0

Ronak Shah

Reputation: 388982

You can fit the model and extract the coefficient in the same step. Using lapply, with reformulate to build the formula from each column name:

output <- do.call(rbind, lapply(names(dt), function(x) {
  model <- glm(reformulate(x, 'y'), dt, family=binomial(link="logit"))
  summary(model)$coefficients[2]
}))
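
A small self-contained run of this approach, as a sketch under assumed toy data (x1, x2 and y are made up for illustration). It swaps lapply plus rbind for vapply so the result is a named numeric vector, and uses coef(model)[2], which should give the same slope estimate as summary(model)$coefficients[2] when the model fits normally.

```r
library(data.table)

# Toy data, assumed for illustration only
set.seed(42)
dt <- data.table(x1 = rnorm(200), x2 = rnorm(200))
y  <- rbinom(200, 1, plogis(0.8 * dt$x1))

# Fit one logistic regression per column and keep the slope estimate
slopes <- vapply(names(dt), function(x) {
  model <- glm(reformulate(x, "y"), data = dt, family = binomial(link = "logit"))
  unname(coef(model)[2])
}, numeric(1))

# Keep the column names next to the coefficients
output <- data.table(variable = names(slopes), reg_coef = slopes)
```

Carrying the column names along makes it easier to tell which coefficient belongs to which predictor than an unnamed one-column matrix.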

Upvotes: 1
