user2334207

Reputation: 25

Replacing all NA numbers in a list with the list's mean

I'm totally new to R, and I've been trying to replace the NA values in each column with that column's mean. I've tried a lot of options, but none seems to work. I've tried this one and many similar ones, but I keep getting the warning: argument is not numeric or logical: returning NA.

script<-function() {
for (i in names(data)) {
        data[[i]][is.na(data[[i]])] <- mean(data[[i]], na.rm=TRUE);
    }
}

But then after a while I thought I'd just count the columns and came up with this:

script<-function() {
    for (i in 1:20) {
        data[[i]][is.na(data[[i]])] <- mean(data[[i]], na.rm=TRUE);
    }
}

which doesn't show any errors, but doesn't seem to work either. When I type in data, I get the same data frame, unchanged. Could anyone help me with this?

Upvotes: 0

Views: 129

Answers (2)

topchef

Reputation: 19783

Feel free to make a function out of this (updated per mnel correction):

data.frame(lapply(data, function(x) replace(x, is.na(x), mean(x, na.rm = TRUE))))

Upvotes: 0

mnel

Reputation: 115392

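The problem is that your loop runs inside a function. Assigning to data there modifies a copy that is local to the function, so the data in your workspace is never updated, and because the function does not return the modified copy, the result is lost.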
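To make the scoping point concrete, here is a minimal sketch. The data frame df and the function fill_in_place are made-up names for illustration, not from the question.

```r
# Hypothetical toy data frame, for illustration only
df <- data.frame(a = c(1, NA, 3), b = c(NA, 4, 6))

fill_in_place <- function() {
  # The first assignment creates a copy of df that is local to this function
  for (i in names(df)) {
    df[[i]][is.na(df[[i]])] <- mean(df[[i]], na.rm = TRUE)
  }
  any(is.na(df))  # refers to the local copy, which has been filled
}

fill_in_place()   # FALSE: the local copy has no NA left
any(is.na(df))    # TRUE: the df in the workspace is untouched
```

The local copy is discarded when the function exits, which is why nothing appears to change.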

Running

for (i in names(data)) {
    data[[i]][is.na(data[[i]])] <- mean(data[[i]], na.rm = TRUE)
}

outside of a function will work as you wish.

Another approach would be to pass data as an argument

imputeMean <- function(data) {
    for (i in names(data)) {
        data[[i]][is.na(data[[i]])] <- mean(data[[i]], na.rm = TRUE)
    }
    return(data)
}
# then you can save the result as a new object

updatedData <- imputeMean(data)
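The warning quoted in the question, "argument is not numeric or logical: returning NA", is what mean() gives when it is handed a non-numeric column such as a character or factor column. Assuming that is what is happening here (the question does not show the data), a sketch of a variant that skips such columns could look like this. The names imputeNumericMean and toy are hypothetical.

```r
# Hypothetical variant of imputeMean that only touches numeric columns
imputeNumericMean <- function(data) {
  for (i in names(data)) {
    col <- data[[i]]
    # mean() is only meaningful for numeric columns; leave the rest as-is
    if (is.numeric(col)) {
      col[is.na(col)] <- mean(col, na.rm = TRUE)
      data[[i]] <- col
    }
  }
  data
}

# Made-up example data with one numeric and one character column
toy <- data.frame(x = c(1, NA, 3), label = c("a", NA, "c"),
                  stringsAsFactors = FALSE)
filled <- imputeNumericMean(toy)
filled$x      # 1 2 3
filled$label  # "a" NA "c", left unchanged
```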

Note that for named lists (which is what a data frame is), [[<- will copy the object on every assignment. You can get around this by using lapply:

updatedData <- lapply(data, function(x) replace(x, is.na(x), mean(x, na.rm = TRUE)))
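One thing to be aware of: lapply returns a plain named list, not a data frame. A short sketch of two ways to handle that, using a made-up data frame toy and a helper fill (both hypothetical names):

```r
# Made-up example data
toy <- data.frame(x = c(1, NA, 3), y = c(NA, 4, 6))
fill <- function(x) replace(x, is.na(x), mean(x, na.rm = TRUE))

as_list <- lapply(toy, fill)   # a named list of filled columns
is.data.frame(as_list)         # FALSE

# Assigning into toy[] replaces the columns but keeps the data frame
toy[] <- lapply(toy, fill)
is.data.frame(toy)             # TRUE
```

Wrapping the result in data.frame(), as in the other answer, is the other option if you want a new object instead of overwriting the original.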

Upvotes: 5
