MLPNPC

Reputation: 525

Change all factor NAs in a dataset in R

I have a dataset that I want to use for building a decision tree in RStudio. Many of the factor columns contain empty (NA) values, and I want to change all of those to "No Data". There are over 100 such columns, so I'd rather change them all at once than one by one.

Example data (please note that in my real data these are all factors; the code below produces numerics, but I don't know how to reproduce the factors because I read the data in from a CSV):

Outcome=c(1,1,1,0,0,0)
VarA=c(1,1,NA,0,0,NA)
VarB=c(0,NA,1,1,NA,0)
VarC=c(0,NA,1,1,NA,0)
VarD=c(0,1,NA,0,0,0)
VarE=c(0,NA,1,1,NA,NA)
VarF=c(NA,NA,0,1,0,0)
VarG=c(0,NA,1,1,NA,0)
df=as.data.frame(cbind(Outcome, VarA, VarB,VarC,VarD,VarE,VarF,VarG)) 

Upvotes: 1

Views: 133

Answers (2)

akrun

Reputation: 887981

When replacing a value in a factor column with a new value, either call factor() again on the column or add the new value to the factor's levels before making the change. Assuming every variable except the first column needs recoding, loop through those columns with lapply, add "No Data" as a level, replace the NA elements with "No Data", and assign the resulting list back to the columns of interest:

df[-1] <- lapply(df[-1], function(x) {
  levels(x) <- c(levels(x), "No Data")
  replace(x, is.na(x), "No Data")
})
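
To try this on the question's example, the columns first have to be factors, since the posted code yields numeric columns. Here is a minimal sketch, assuming the real data resembles a shortened version of the example once it is read in as factors. The `df[] <- lapply(df, factor)` step is only there to mimic that and is not part of the fix itself:

```r
# Rebuild part of the question's example, then convert every column to a
# factor to mimic the data the asker describes (an assumption).
df <- data.frame(
  Outcome = c(1, 1, 1, 0, 0, 0),
  VarA = c(1, 1, NA, 0, 0, NA),
  VarB = c(0, NA, 1, 1, NA, 0),
  VarF = c(NA, NA, 0, 1, 0, 0)
)
df[] <- lapply(df, factor)

# Add "No Data" as a level, then replace the NAs in every column but the first.
df[-1] <- lapply(df[-1], function(x) {
  levels(x) <- c(levels(x), "No Data")
  replace(x, is.na(x), "No Data")
})

str(df)          # inspect: columns remain factors with the extra level
sum(is.na(df))   # count of remaining missing values
```

Adding the level first matters because assigning a value that is not among a factor's levels produces NA with a warning rather than the new value.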

Upvotes: 2

Georgery

Reputation: 8127

You might try this:

df[is.na(df)] <- "NoData"
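
Note that this one-liner behaves differently depending on the column type. On character or numeric columns it fills the gaps (numeric columns become character). On factor columns, R cannot assign a value that is not an existing level, so it warns "invalid factor level, NA generated" and the NAs stay in place. One possible workaround, sketched here on hypothetical data where every column is a factor, is to go through character and back:

```r
# Hypothetical factor data with gaps, standing in for the real CSV.
df <- data.frame(
  VarA = factor(c("1", "1", NA, "0")),
  VarB = factor(c("0", NA, "1", "1"))
)

# Convert to character so any string can be assigned, fill the gaps,
# then turn the columns back into factors.
df[] <- lapply(df, as.character)
df[is.na(df)] <- "NoData"
df[] <- lapply(df, factor)
```

Rebuilding the factors this way resets the level order to the sorted unique values, which may matter if the original level order was meaningful.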

Upvotes: 0
