Reputation: 1203
I would like to change blanks (no value) to missing (NA). I assumed this happens automatically when R reads data (a CSV in my case), but the blanks are still there, so I tried:
is.na(data) <- data==""
I also tried:
data <- read.table("data.csv", header=TRUE, sep=";", na.strings="")
data[data==""] <- NA
But blanks remain. How can I solve this?
Upvotes: 3
Views: 17472
Reputation: 887118
To show that the code works:
data <- data.frame( col1= c("", letters[1:4]), col2=c(letters[1:4], ""))
is.na(data) <- data==''
data
# col1 col2
#1 <NA> a
#2 a b
#3 b c
#4 c d
#5 d <NA>
However, if some cells contain spaces (' ') rather than being truly empty (''), this won't work:
data <- data.frame( col1= c("", letters[1:4]), col2=c(letters[1:4], " "))
data1 <- data
is.na(data) <- data==''
data
# col1 col2
#1 <NA> a
#2 a b
#3 b c
#4 c d
#5 d
In such cases, you could trim the whitespace first with str_trim from the stringr package:
library(stringr)
data1[] <- lapply(data1, str_trim)
is.na(data1) <- data1==''
data1
# col1 col2
#1 <NA> a
#2 a b
#3 b c
#4 c d
#5 d <NA>
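If you would rather not load stringr, base R's trimws (available since R 3.2.0) should do the same job. The following is only a sketch: the id column and the use of stringsAsFactors = FALSE are my own assumptions for illustration, not part of the original data.

```r
# Hypothetical example data: one numeric column and two character columns,
# one of which holds a space-only cell instead of a truly empty one.
data1 <- data.frame(id   = 1:5,
                    col1 = c("", letters[1:4]),
                    col2 = c(letters[1:4], " "),
                    stringsAsFactors = FALSE)

# Restrict the cleanup to character columns so numeric columns keep their type.
is_chr <- vapply(data1, is.character, logical(1))

data1[is_chr] <- lapply(data1[is_chr], function(col) {
  col <- trimws(col)      # strip leading/trailing whitespace
  col[col == ""] <- NA    # anything that is now empty becomes NA
  col
})

data1
```

Limiting the operation to character columns avoids silently converting numeric columns to text, which applying a string function to every column would do.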
Upvotes: 6
Reputation: 92292
Just use na.strings = "" when reading the data, for example:
test1 <- data.frame(A = 1:6, B = c("6","7", "",3, "","7")) # Assuming this is your data
test1
# A B
# 1 1 6
# 2 2 7
# 3 3
# 4 4 3
# 5 5
# 6 6 7
tf <- tempfile() # Creating some temp file for illustration
write.csv(test1, tf, row.names = FALSE) # Saving the dummy data on the hard disk
read.csv(tf, na.strings = "") # Reading it back while specifying na.strings = ""
# A B
# 1 1 6
# 2 2 7
# 3 3 NA
# 4 4 3
# 5 5 NA
# 6 6 7
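Since the question uses read.table with sep = ";" and a text column, here is a small sketch of the same idea for that case. The inline csv_text and its column names are made up for illustration; I am also passing "NA" in na.strings so that literal NA entries are still recognised.

```r
# Hypothetical semicolon-separated input with an empty field in row 2.
csv_text <- "id;name\n1;a\n2;\n3;c"

dat <- read.table(text = csv_text, header = TRUE, sep = ";",
                  na.strings = c("", "NA"),   # treat empty fields as NA
                  stringsAsFactors = FALSE)

dat
is.na(dat$name)
```

If the blanks in the file are padded with spaces, adding strip.white = TRUE is worth trying, since the read.table documentation describes it as stripping leading and trailing white space from unquoted character fields.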
Upvotes: 5
Reputation: 179428
Try this:
x <- c("a", "", "b", "", "1")
x
x[x==""] <- NA
x
The results:
[1] "a" NA "b" NA "1"
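If some of the "blank" entries might actually be spaces or tabs, a pattern match can catch them as well. This is just a sketch with a made-up vector; the regular expression treats any string consisting only of whitespace (including the empty string) as blank.

```r
# Hypothetical vector: one empty string and one whitespace-only string.
x <- c("a", "", "b", "  ", "1")

# grepl returns TRUE for empty or whitespace-only elements.
x[grepl("^[[:space:]]*$", x)] <- NA
x
```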
Upvotes: 9