Error in lm.wfit(x, y, w, offset = offset, singular.ok = singular.ok, : 0 (non-NA) cases. But all columns contain at least one non-NA value

Question

A linear regression is returning an error saying that there is a column with 0 non-NA values, even though I checked each column and confirmed that every column has more than 10 non-NA values. I'd appreciate any suggestions on what to look into to diagnose this error.

> reg_CTR <- lm(formula = modelFormula_CTR, data = reg_data_lowCorr, weights = dailyImps, na.action = na.exclude)
Error in lm.wfit(x, y, w, offset = offset, singular.ok = singular.ok,  : 
  0 (non-NA) cases
> min(apply(reg_data_lowCorr, 2, function(x) sum(!is.na(x))))
[1] 11
> sum(!is.na(c(NA,NA,NA)))
[1] 0
> sum(!is.na(c(NA,NA,1)))
[1] 1
> reg_CTR <- lm(formula = modelFormula_CTR, data = reg_data_lowCorr, weights = dailyImps, na.action = na.omit)
Error in lm.wfit(x, y, w, offset = offset, singular.ok = singular.ok,  : 
  0 (non-NA) cases

The data has NA values, but those are necessary, hence the use of na.exclude.

From what I read online, I had a few ideas of things to look into, but nothing seemed to apply to this situation.

All columns are numeric

> sum(sapply(reg_data_lowCorr, is.factor))
[1] 0

Model formula is dynamically generated, so there's no risk of misspelling

selectedVars <- c(names(reg_data_lowCorr[,3:ncol(reg_data_lowCorr)]))
modelFormula_CTR <- as.formula(paste0('CTR000', " ~ ", paste(selectedVars, collapse = "+")))
reg_CTR <- lm(formula = modelFormula_CTR, data = reg_data_lowCorr, weights = dailyImps, na.action = na.exclude)

Joseph Clark McIntyre · Accepted Answer

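I don't believe it's saying that you have a column which is all NA. I believe it means that there are no rows free of missing data. Both na.omit and na.exclude drop every row that has an NA in any variable the model uses, so if each row is missing at least one value, nothing is left to fit. In the example below, note that both b and c have non-missing entries, but no row is complete.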

> a <- 1:4
> b <- c(1, 2, NA, NA)
> c <- c(NA, NA, 1, 2)
> lm(a ~ b + c, na.action = na.exclude)
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 
  0 (non-NA) cases

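You could check by running something like table(rowSums(is.na(data[vars]))), where data is your dataset and vars are the variables in the model. It is probably worth including the weights column (dailyImps) in vars as well, since rows with NA weights should be dropped too. See whether the table has any rows with zero missing values.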
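To make that check concrete, here is a minimal sketch on toy data. The names dat, model_formula and w are made up for illustration and stand in for your reg_data_lowCorr, modelFormula_CTR and dailyImps. It counts missing values per row, counts the rows that would survive NA removal, and lists which variables have the most NAs.

```r
# Toy data standing in for the real dataset; all names here are illustrative.
dat <- data.frame(
  y = 1:4,
  b = c(1, 2, NA, NA),
  c = c(NA, NA, 1, 2),
  w = c(10, 20, 30, 40)
)
model_formula <- y ~ b + c

# Variables the model touches: everything in the formula plus the weights column.
vars <- c(all.vars(model_formula), "w")

# Number of missing values in each row; a "0" entry in the table means
# at least one complete row exists.
missing_per_row <- rowSums(is.na(dat[vars]))
print(table(missing_per_row))

# Number of rows with no NA in any of the model variables.
n_complete <- sum(complete.cases(dat[vars]))
print(n_complete)

# NA count per variable, largest first, to see which columns cost the most rows.
na_per_var <- sort(colSums(is.na(dat[vars])), decreasing = TRUE)
print(na_per_var)
```

If n_complete comes back as 0 on your data, the variables at the top of na_per_var are the first candidates to drop from the formula or to impute.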
