Reputation: 51
I am new to R and started exploring the na.strings = c() function along with read.csv.
I have read that with this option all missing values are replaced with NA, but I don't see that happening in my files. The output is the same whether or not I use na.strings = c(). Please help if I am missing something. In both cases I see NA when a numeric value is missing, but not when a character value is missing. So what is the use of this function?
Here is my sample csv file:
Char,Numeric
A,3
B,
,5
And my code:
DF_withoutNA = read.csv("filepath/R_NA.csv",header = TRUE)
DF_with = read.csv("filepath/R_NA.csv",header = TRUE,
                   na.strings = c("Char","Numeric"))
head(DF_withoutNA)
  Char Numeric
1    A       3
2    B      NA
3            5
head(DF_with)
  Char Numeric
1    A       3
2    B      NA
3            5
Upvotes: 3
Views: 48575
Reputation: 47
Try this:
new_R_NA <- read.csv("filepath/R_NA.csv", header= TRUE, na.strings=c(""," ","NA"))
With this, the change can be easy to miss if you only glance at View(new_R_NA). To confirm that it worked, check is.na(new_R_NA) or colSums(is.na(new_R_NA)). Note that read.csv only changes the data frame in R; it does not modify the CSV file, so opening the file in Excel will not show any difference.
NOTE: Notice how I used both "" and " " in na.strings=c(""," ","NA").
The point is to capture both the cells that contain only a blank (" ") and the cells that are completely empty ("").
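To see the difference between "" and " ", here is a small sketch that reads inline data through the text argument of read.csv instead of a file. The sample data and the names csv_text, only_empty and both are made up for illustration, and it assumes the default strip.white = FALSE of read.csv.

```r
# Hypothetical sample: row 2 has a single space in Char, row 3 has nothing at all.
csv_text <- "Char,Numeric\nA,3\n ,4\n,5"

# Only the empty string is treated as missing.
only_empty <- read.csv(text = csv_text, na.strings = "")

# Empty string, single space, and the literal NA are all treated as missing.
both <- read.csv(text = csv_text, na.strings = c("", " ", "NA"))

# Compare which Char cells ended up as NA in each case.
is.na(only_empty$Char)
is.na(both$Char)
```

If the two results differ in row 2, the single-space cell was only caught by the second call.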
Also, avoid using na.strings=c() with read_excel(). It does not work there the way it does with read.csv(), because read_excel() (from the readxl package) takes an argument named na instead.
Upvotes: 0
Reputation: 1
na.strings tells R which strings in the file should be treated as missing and read in as NA. It is best set when the data is first read, at the beginning of the data cleaning process.
Upvotes: -1
Reputation:
The na.strings argument is for substitution within the body of the file, that is, it lists the strings that should be replaced with NA. That is why passing "Char" and "Numeric" changed nothing: those strings appear only in the header. With your example, if you pass the empty string "" it should match your missing character value, which is an empty string once white space is stripped.
x <- read.csv("filepath/R_NA.csv",header=TRUE,na.strings=c(""))
x
  Char Numeric
1    A       3
2    B      NA
3 <NA>       5
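To check this without a file on disk, here is a minimal sketch that feeds the question's sample data to read.csv through its text argument. The names csv_text, df_default and df_empty are only illustrative.

```r
# The sample data from the question, inlined so no file is needed.
csv_text <- "Char,Numeric\nA,3\nB,\n,5"

# Default na.strings ("NA"): the empty Char cell is read as an empty string.
df_default <- read.csv(text = csv_text, header = TRUE)

# With "" in na.strings: the empty Char cell is read as NA.
df_empty <- read.csv(text = csv_text, header = TRUE, na.strings = "")

# Count missing values per column in each version.
colSums(is.na(df_default))
colSums(is.na(df_empty))
```

The empty Numeric cell is NA in both versions, because blank fields in numeric columns are always treated as missing; only the Char column changes.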
Upvotes: 6
Reputation: 28339
what is the use of using this function?
It replaces values (e.g., characters, numbers) in your csv file with NA. If you try read.csv("filepath/R_NA.csv", na.strings = "A") you'll see that all A's in the csv were replaced with NA's.
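A quick way to try this, as a sketch that uses the question's sample data inline via the text argument rather than the file (csv_text and df_a are illustrative names):

```r
# The sample data from the question, inlined so no file is needed.
csv_text <- "Char,Numeric\nA,3\nB,\n,5"

# Treat the string "A" as missing.
df_a <- read.csv(text = csv_text, na.strings = "A")

# The first Char value was "A" in the file, so it should now be NA.
is.na(df_a$Char)
```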
PS. na.strings is an argument, not a function.
Upvotes: 4