Reputation: 31
I have a census dataset with some missing values indicated with a ?. When checking for incomplete cases, R reports none because it treats the ? as a valid character. Is there any way to change all the ? to NA? I would like to run multiple imputation with the mice package afterwards to fill in the missing data.
Upvotes: 0
Views: 1754
Reputation: 3833
Creating data frame df
df <- data.frame(A=c("?",1,2),B=c(2,3,"?"))
df
# A B
# 1 ? 2
# 2 1 3
# 3 2 ?
I. Using the replace() function
replace(df,df == "?",NA)
# A B
# 1 <NA> 2
# 2 1 3
# 3 2 <NA>
II. While importing a file that uses ? for missing values
data <- read.table("xyz.csv", sep=",", header=TRUE, na.strings=c("?","NA"))
data
# A B
# 1 1 NA
# 2 2 3
# 3 3 4
# 4 NA NA
# 5 NA NA
# 6 4 5
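To try the import approach without a file on disk, read.table() also accepts a text= argument. This is a minimal sketch; the inline data below is made up for illustration and only stands in for xyz.csv.

```r
# Hypothetical inline data standing in for xyz.csv
csv_text <- "A,B
1,?
2,3
?,4"

# na.strings lists the strings that read.table should treat as missing
data <- read.table(text = csv_text, sep = ",", header = TRUE,
                   na.strings = c("?", "NA"))

str(data)             # columns are read as integer, not character
complete.cases(data)  # FALSE for every row that contains an NA
```

Handling the ? at import time has the advantage that the columns keep a numeric type, so no conversion step is needed before imputation.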
Upvotes: 2
Reputation: 2244
Data frames can be indexed with a logical matrix, so the assignment below sets every matching cell to NA. You may need to fiddle with the quotation marks. I have not tested this.
df[df == "?"] <- NA
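A slightly fuller sketch of the same idea, reusing the toy df from the other answer (an assumption, since the census data itself is not shown) and assuming R 4.0 or later, where data.frame() keeps strings as character. After the replacement the columns are still character, so they are converted with type.convert() before checking for incomplete cases.

```r
# Toy data frame; the real census data is not shown in the question
df <- data.frame(A = c("?", 1, 2), B = c(2, 3, "?"))

# df == "?" is a logical matrix; assigning through it sets those cells to NA
df[df == "?"] <- NA

# The columns are still character because they originally held "?";
# convert them so that later steps see numeric variables
df[] <- lapply(df, type.convert, as.is = TRUE)

str(df)
sum(!complete.cases(df))  # number of rows with at least one NA
```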
Upvotes: 4