When I run the program below, I get this error:
Error in solve.default(sig[!pick.miss, !pick.miss]) : 'a' is 0-diml
I want to use the EM algorithm to impute missing values. The function works when the proportion of missing values is small (from 5% to 20% of the dataset), but when it rises to about 30% of the dataset, running the program gives the error above. I am very confused and would appreciate any help. Thank you very much.
library(e1071)
EMalg <- function(x, tol=.001){
  missvals <- is.na(x)
  new.impute <- x
  old.impute <- x
  count.iter <- 1
  reach.tol <- 0
  # starting estimates come from the complete rows only
  sig <- as.matrix(var(na.exclude(x)))
  mean.vec <- as.matrix(apply(na.exclude(x),2,mean))
  while(reach.tol != 1) {
    for(i in 1:nrow(x)) {
      pick.miss <- ( c( missvals[i,]) )
      if ( sum(pick.miss) != 0 ) {
        # the reported error is raised by this call to solve()
        inv.S <- solve(sig[!pick.miss,!pick.miss])
        new.impute[i,pick.miss] <- mean.vec[pick.miss] +
          sig[pick.miss,!pick.miss] %*%
          inv.S %*%
          (t(new.impute[i,!pick.miss])- t(t(mean.vec[!pick.miss])))
      }
    }
    sig <- var((new.impute))
    mean.vec <- as.matrix(apply(new.impute,2,mean))
    if(count.iter > 1){
      reach.tol <- 1 # assume convergence until some cell changes by more than tol
      for(l in 1:nrow(new.impute)){
        for(m in 1:ncol(new.impute)){
          if( abs((old.impute[l,m]-new.impute[l,m])) > tol ) {
            reach.tol <- 0
          }
        }
      }
    }
    count.iter <- count.iter+1 # used for debugging purposes to ensure the process is iterating properly
    old.impute <- new.impute
  }
  return(new.impute)
}
modelerData<-read.csv(file.choose(), header=TRUE)
imputed <- EMalg(modelerData, tol=.0001)
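One check that may help narrow this down (a sketch, not a confirmed diagnosis): as far as I can tell, solve() reports 'a' is 0-diml when it is given a matrix with zero rows and columns, and sig[!pick.miss, !pick.miss] becomes such a matrix for any row in which every value is missing. The helper name all_missing_rows, the demo data frame, and the identity matrix standing in for sig are all made up for illustration and are not part of my program.

```r
# Hypothetical helper (not in the original program): return the indices of
# rows in which every column is NA.
all_missing_rows <- function(x) {
  miss <- is.na(as.matrix(x))
  unname(which(rowSums(miss) == ncol(miss)))
}

# Made-up example data: row 2 has no observed values at all.
demo <- data.frame(a = c(1, NA, 3), b = c(NA, NA, 6), c = c(7, NA, 9))
bad <- all_missing_rows(demo)
print(bad)

# Stand-in covariance matrix (assumption: any 3 x 3 matrix shows the point).
sig <- diag(3)

# Same indexing as in EMalg, applied to the fully missing row.
pick.miss <- is.na(unlist(demo[2, ]))
empty_block <- sig[!pick.miss, !pick.miss, drop = FALSE]

# The "observed" block has zero rows and zero columns, so it cannot be inverted.
print(dim(empty_block))
```

If all_missing_rows(modelerData) returns any indices, those rows would have to be dropped or handled separately before calling EMalg, since there are no observed values to condition on.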