lydias

Reputation: 841

R Regression imputation on missing data

Hi! I'm trying to apply regression imputation to the missing values of the dataset 'chmiss' from the package 'faraway'. The code I have so far fails when it fits the regression and drops columns from the data frame in the same step. Could anyone help me correct it?

X <- chmiss
for(j in c(1:4,6)){
     new_Y <- X[,j]
     new_X <- X[,c(-j,-5)]
     new_XY <- cbind(new_X,new_Y)
     temp_lm <- lm(new_Y~.,data=new_XY)
     X[is.na(new_Y),j] <- predict(temp_lm,new_X[is.na(new_Y),c(-j,-5)])
}

Upvotes: 1

Views: 230

Answers (1)

Bastien

Reputation: 3097

Try this:

library(faraway)
data(chmiss)
X <- chmiss
for(j in c(1:4,6)){
  new_Y <- X[,j]
  new_X <- X[,c(-j,-5)]
  new_XY <- cbind(new_X,new_Y)
  temp_lm <- lm(new_Y~.,data=new_XY)
  X[is.na(new_Y),j] <- predict(temp_lm,new_X[is.na(new_Y),]) ## difference here: no second column drop on new_X
}

You already remove columns j and 5 with c(-j,-5) when you create new_X, so applying c(-j,-5) again in the predict call drops useful predictor columns instead.
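
To see the double drop in isolation, here is a minimal sketch on a made-up data frame. The name toy, its column names a to f, and its values are hypothetical; it only mimics the six-column layout of chmiss and does not use the real data:

```r
# Hypothetical toy data with six columns, standing in for chmiss.
toy <- data.frame(a = 1:3, b = 4:6, c = 7:9, d = 10:12, e = 13:15, f = 16:18)

j <- 1
new_X <- toy[, c(-j, -5)]            # drop the response (column 1) and column 5
once  <- names(new_X)                # predictors the model is fitted on
twice <- names(new_X[, c(-j, -5)])   # same negative index applied a second time

print(once)
print(twice)
print(setdiff(once, twice))          # predictor lost by the second drop
```

After the first drop, new_X has only four columns, so position j now refers to a different variable than it did in the original data, and position 5 no longer exists (R ignores out-of-range negative indices). The model was fitted on all of the columns in once, so a predict call on the twice-dropped data is missing a predictor and will typically fail with an object-not-found error.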

Upvotes: 1
