Mamba

Reputation: 1203

How to replace blank strings with NA?

I would like to change blanks (empty values) to missing values (NA). I assumed this would happen automatically when R reads the data (a CSV file in my case), but the blanks are kept as they are, so I tried:

is.na(data) <- data==""

I also tried:

data <- read.table("data.csv", header=TRUE, sep=";", na.strings="")
data[data==""] <- NA

But the blanks remain. How can I solve this?

Upvotes: 3

Views: 17472

Answers (3)

akrun

Reputation: 887118

To show that your first attempt does work on empty strings:

data <- data.frame(col1 = c("", letters[1:4]), col2 = c(letters[1:4], ""))
is.na(data) <- data == ''
data
#  col1 col2
#1 <NA>    a
#2    a    b
#3    b    c
#4    c    d
#5    d <NA>

If the data contains strings made of spaces (' ') as well as empty strings (''), this alone won't work, because ' ' is not equal to '':

data <- data.frame(col1 = c("", letters[1:4]), col2 = c(letters[1:4], " "))
data1 <- data
is.na(data) <- data == ''
data
#  col1 col2
#1 <NA>    a
#2    a    b
#3    b    c
#4    c    d
#5    d     

In such cases, you could trim the whitespace first with str_trim from the stringr package:

library(stringr)
data1[] <- lapply(data1, str_trim)  # strip leading/trailing whitespace in every column
is.na(data1) <- data1 == ''
data1
#  col1 col2
#1 <NA>    a
#2    a    b
#3    b    c
#4    c    d
#5    d <NA>
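
If you would rather not load stringr, the same idea can be sketched in base R with trimws (available since R 3.2.0). The name data2 is only for illustration, and I restrict the trimming to character columns so that numeric columns are not converted to text:

```r
# Base-R sketch: trim whitespace in character columns only, then turn blanks into NA.
data2 <- data.frame(col1 = c("", letters[1:4]),
                    col2 = c(letters[1:4], " "),
                    stringsAsFactors = FALSE)

is_chr <- vapply(data2, is.character, logical(1))  # which columns hold strings
data2[is_chr] <- lapply(data2[is_chr], trimws)     # " " becomes ""
is.na(data2) <- data2 == ""                        # "" becomes NA
data2
```

Both the empty string in col1 and the single space in col2 should end up as NA.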

Upvotes: 6

David Arenburg

Reputation: 92292

Just use na.strings = "" when reading the data, for example:

test1 <- data.frame(A = 1:6, B = c("6", "7", "", "3", "", "7")) # Assuming this is your data
test1
#   A B
# 1 1 6
# 2 2 7
# 3 3  
# 4 4 3
# 5 5  
# 6 6 7

tf <- tempfile() # Creating a temp file for illustration
write.csv(test1, tf, row.names = FALSE) # Saving the dummy data to disk
read.csv(tf, na.strings = "") # Reading it back while specifying na.strings = ""
#   A  B
# 1 1  6
# 2 2  7
# 3 3 NA
# 4 4  3
# 5 5 NA
# 6 6  7
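
If the file is semicolon-separated as in the question, and some of the "blank" fields actually contain a space, something along these lines may help. It is only a sketch: the file contents and the names tf2 and d are made up for illustration. Listing both "" and " " in na.strings covers empty and single-space fields, and strip.white = TRUE (which only has an effect when sep is given and only on unquoted fields) trims the remaining character fields:

```r
# Hypothetical semicolon-separated file: row 2 has an empty field, row 3 a single space.
tf2 <- tempfile(fileext = ".csv")
writeLines(c("A;B", "1;x", "2;", "3; ", "4;y"), tf2)

d <- read.table(tf2, header = TRUE, sep = ";",
                na.strings = c("", " "),  # treat empty and single-space fields as NA
                strip.white = TRUE)       # trim unquoted character fields
is.na(d$B)  # which rows of B were read as missing
unlink(tf2) # remove the temporary file
```

Note that write.csv quotes strings by default, and strip.white does not touch quoted fields, so for quoted data trimming after reading is the safer route.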

Upvotes: 5

Andrie

Reputation: 179428

Try this:

x <- c("a", "", "b", "", "1")
x
x[x==""] <- NA
x

The results:

[1] "a" NA  "b" NA  "1"
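
If this seems to have no effect on your own data, the "blanks" may not really be empty strings. A quick check, sketched on a made-up vector y, is to look at the lengths of the entries that are empty after trimming:

```r
# Made-up vector: one truly empty string and one that holds a single space.
y <- c("a", "", "b", " ", "1")

blank <- !is.na(y) & trimws(y) == ""  # entries that are empty once whitespace is removed
blank_lengths <- nchar(y[blank])      # a length above 0 means hidden whitespace
blank_lengths

y[blank] <- NA                        # replace both kinds of blank
y
```

Any length greater than zero in blank_lengths tells you that y == "" alone would have missed that entry.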

Upvotes: 9
