How to replace missing categorical data in a large dataset in R?

Question

I use the dataset from https://www.kaggle.com/datasets/shilongzhuang/telecom-customer-churn-by-maven-analytics
It has many categorical variables with missing values, and I am not sure how to deal with them. Since almost every row has at least one missing value, I can't just delete the rows. Imputing with the mean or mode is also not applicable to this dataset.

What is the best way to handle these missing values?

For example, I tried to impute the variable Multiple.Lines like this:

telecom_customer_churn$Multiple.Lines <- impute(telecom_customer_churn$Multiple.Lines, "random")

This works, but when I try to make a bar plot like this:

ggplot(data = telecom_customer_churn) +
  geom_histogram(mapping = aes(x = Multiple.Lines),  color = "blue", fill = "lightblue")

It shows me the error:

Error: Discrete value supplied to continuous scale

This is weird to me because all the missing values of Multiple.Lines have been replaced by either 'yes' or 'no'.
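
For reference, here is a minimal sketch of two common ways to fill a categorical column and then plot it. It is an illustration under stated assumptions, not a definitive recipe: the toy data frame df and the column names Multiple.Lines.explicit and Multiple.Lines.random are made up for the example, and blanks are assumed to mean "missing". Note that geom_histogram() bins a continuous x, whereas geom_bar() counts the occurrences of each discrete value, so a Yes/No column is normally drawn with geom_bar().

```r
library(ggplot2)

set.seed(1)  # make the random draw reproducible

# Toy stand-in for the real data (hypothetical values, for illustration only).
df <- data.frame(
  Multiple.Lines = c("Yes", "No", NA, "No", "", "Yes", NA, "No"),
  stringsAsFactors = FALSE
)

# Assumption: empty strings also represent missing values, so convert them to NA.
df$Multiple.Lines[df$Multiple.Lines %in% ""] <- NA
missing <- is.na(df$Multiple.Lines)

# Option 1: keep missingness visible as its own category.
df$Multiple.Lines.explicit <- ifelse(missing, "Unknown", df$Multiple.Lines)

# Option 2: fill each gap with a random draw from the observed values,
# which roughly preserves the observed Yes/No proportions.
observed <- df$Multiple.Lines[!missing]
df$Multiple.Lines.random <- df$Multiple.Lines
df$Multiple.Lines.random[missing] <- sample(observed, sum(missing), replace = TRUE)

# Store the result as a factor and count categories with geom_bar().
df$Multiple.Lines.random <- factor(df$Multiple.Lines.random)
p <- ggplot(data = df) +
  geom_bar(mapping = aes(x = Multiple.Lines.random), color = "blue", fill = "lightblue")
```

Printing p draws the bar chart. Which option fits depends on why the values are missing: if a blank means the question does not apply to that customer, an explicit category may be more faithful than a random draw.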

