Reputation: 17
I have a data frame with some columns of type "factor" and others of type "numeric". There are no missing values in any of the "factor" columns.
I am trying to replace the missing values in each column with that column's median, using the following code:
for (i in 1:ncol(df3)) {
  df3[is.na(df3[, i]), i] <- median(df3[, i], na.rm = TRUE)
}
However, I am getting the error:
Error in median.default(df3[, i], na.rm = TRUE) : need numeric data
I am sure that there are missing values only in the numeric columns, so why am I getting this error?
More importantly, how do I fill the missing values in each column with the respective column median?
Upvotes: 0
Views: 517
Reputation: 1045
Even if df3[is.na(df3[, i]), i] selects nothing, R still has to evaluate the right-hand side, median(df3[, i], na.rm = TRUE), and median() fails on a factor column because it needs numeric data. You could add a check so that missing values are replaced only in numeric columns:
for (i in seq_along(df3)) {
  # Skip factor (and other non-numeric) columns so median() is never called on them
  if (is.numeric(df3[[i]])) {
    df3[[i]][is.na(df3[[i]])] <- median(df3[[i]], na.rm = TRUE)
  }
}
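As an alternative to the explicit loop, here is a minimal sketch that selects the numeric columns first and fills them in one step. The toy data frame (the columns grp, x and y and their values) is made up for illustration and is not from the question; only base R functions are used.

```r
# Hypothetical toy data: one factor column and two numeric columns with NAs.
df3 <- data.frame(
  grp = factor(c("a", "b", "a", "b")),
  x   = c(1, NA, 3, 10),
  y   = c(NA, 2, 4, 6)
)

# Logical vector marking which columns are numeric.
num_cols <- vapply(df3, is.numeric, logical(1))

# For each numeric column, replace its NAs with that column's median.
# Factor columns are never passed to median(), so the error cannot occur.
df3[num_cols] <- lapply(df3[num_cols], function(col) {
  col[is.na(col)] <- median(col, na.rm = TRUE)
  col
})
```

With this toy data, the NA in x becomes 3 and the NA in y becomes 4, while grp is left untouched. Note that an integer column can be converted to double if its median turns out to be fractional.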
Upvotes: 1