user102546
user102546

Reputation: 17

Regression in a loop but get "Error in coef(summary(fit))[2, 4] : subscript out of bounds"

When running the following code on a high dimensional dataset I get the error message

Error in coef(summary(fit))[2, 4] : subscript out of bounds

The vector beta which the p-values of the logistic regression models are saved to has length 19481. If I loop through the different independent variables of the regression model up to 100 times I do not get this error.

Could anyone give me hint why my code does not run smoothly?

beta = rep(0, 19481)
for (i in 25:19505) {
  fit = glm(mdr.mdr ~ an.mdr[,i], family=binomial)
  beta[i-24] = coef(summary(fit))[2,4]
  }

Upvotes: 0

Views: 1771

Answers (2)

Zheyuan Li
Zheyuan Li

Reputation: 73325

As the error comes when you try to extract [2,4] element of the coefficient table, that is, the p-value of the slope, I am sure you have NA estimate for the slope.

This means that for some i, your model is rank-deficient and there is no information to estimate the slope.

Note that, coef(summary(fit)) will drop NA estimate, so in this situation your coefficient table only has one row instead of two rows (which explains the "out-of-bound" error). See Coefficient table does not have NA rows in rank-deficient fit; how to insert them?

I suggest the following:

beta = rep(NA, 19481)
for (i in 25:19505) {
  fit = glm(mdr.mdr ~ an.mdr[,i], family = binomial)
  slope <- coef(fit)[2]
  if (!is.na(slope)) beta[i-24] = coef(summary(fit))[2,4]
  }

Another potential failure of this loop is "no complete cases", that is, sum(complete.cases(mdr.mdr, an.mdr[, i])) gives you 0. If this does happen, you might want:

beta = rep(NA, 19481)
for (i in 25:19505) {
  if (sum(complete.cases(mdr.mdr, an.mdr[, i])) > 0) {
    fit = glm(mdr.mdr ~ an.mdr[,i], family = binomial)
    slope <- coef(fit)[2]
    if (!is.na(slope)) beta[i-24] = coef(summary(fit))[2,4]
    }
  }

Upvotes: 1

stevec
stevec

Reputation: 52308

Run the loop, and when it stops, type i into the console and press enter. This will tell you which loop iteration the loop failed at. Then inspect an.mdr[,i] for anything unexpected

Try this:

  beta = rep(0, 19481) 
  for (i in 25:19505) {
      fit = glm(mdr.mdr ~ an.mdr[,i], family=binomial) 
      if (is.na(coef(summary(fit))[2,4]) { beta[i-24] = NA }
      if (!is.na(coef(summary(fit))[2,4]) { beta[i-24] = coef(summary(fit))[2,4] }
  }

Upvotes: 0

Related Questions