Reputation: 839
My data looks like this after import:
A = data.frame( ID= c(1,2,3,4,5,6), Name = c(NA,"A",NA,NA,NA,"B"))
> A
  ID Name
1  1 <NA>
2  2    A
3  3 <NA>
4  4 <NA>
5  5 <NA>
6  6    B
I expect this result when I select all rows with Name == "A":
ID Name
2 2 A
However, I get 5 rows:
> A[A$Name=="A",]
ID Name
NA NA <NA>
2 2 A
NA.1 NA <NA>
NA.2 NA <NA>
NA.3 NA <NA>
Note that I am not looking for complete.cases(), since there are many more columns in the data frame. I also specified the na.strings parameter in read.csv(...,na.strings = NA). The missing values in the csv file are NA, not "NA", and changing that setting during import made no difference.
Upvotes: 1
Views: 125
Reputation: 24074
You can also use %in% instead of ==:
A[A$Name %in% "A", ]
# ID Name
#2 2 A
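As a rough illustration of why this works: %in% is built on match(), so an NA on the left-hand side comes back as FALSE rather than NA. The variable names keep and res_in below are just placeholders for this sketch.

```r
# Same example data as in the question
A <- data.frame(ID = c(1, 2, 3, 4, 5, 6), Name = c(NA, "A", NA, NA, NA, "B"))

# %in% returns only TRUE or FALSE, never NA
keep <- A$Name %in% "A"
keep
# [1] FALSE  TRUE FALSE FALSE FALSE FALSE

# Indexing with a logical vector that has no NA gives just the matching row
res_in <- A[keep, ]
res_in
```

The same pattern should extend to several values at once, e.g. A$Name %in% c("A", "B").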
Upvotes: 2
Reputation: 887118
Here is one way: convert A to a data.table and set 'Name' as the key column.
library(data.table)
setDT(A, key='Name')['A']
# ID Name
#1: 2 A
Upvotes: 1
Reputation: 5424
Yes, this is apparently the intended behaviour in R.
Try excluding the NA values explicitly:
A = data.frame( ID= c(1,2,3,4,5,6), Name = c(NA,"A",NA,NA,NA,"B"))
A[A$Name=="A" & !is.na(A$Name),]
ID Name
2 2 A
This is because comparing NA to a value evaluates to NA rather than TRUE or FALSE, and each NA in a logical row index produces a row of NAs. For example:
"B" == "A"
[1] FALSE
"A" == "A"
[1] TRUE
NA == "A"
[1] NA
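To connect this to the five rows in the question, here is a small sketch of the index vector itself, plus which() as another way to drop the NA positions. The names idx, rows and res_which are only for illustration.

```r
# Same example data as in the question
A <- data.frame(ID = c(1, 2, 3, 4, 5, 6), Name = c(NA, "A", NA, NA, NA, "B"))

# The comparison yields NA wherever Name is NA
idx <- A$Name == "A"
idx
# [1]    NA  TRUE    NA    NA    NA FALSE

# One TRUE plus four NA gives the five rows seen in the question
nrow(A[idx, ])
# [1] 5

# which() keeps only the positions that are TRUE, so NA and FALSE are dropped
rows <- which(idx)
res_which <- A[rows, ]
res_which
```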
Upvotes: 1
Reputation: 1632
To see the result you need, try this:
> subset(A,Name=="A")
ID Name
2 2 A
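This works because subset() treats missing values in the condition as FALSE, as its help page describes. A minimal sketch comparing it with an explicit is.na() guard; res_subset and res_manual are placeholder names.

```r
# Same example data as in the question
A <- data.frame(ID = c(1, 2, 3, 4, 5, 6), Name = c(NA, "A", NA, NA, NA, "B"))

# subset() drops rows where the condition is NA
res_subset <- subset(A, Name == "A")

# Equivalent bracket indexing with the NA rows excluded by hand
res_manual <- A[!is.na(A$Name) & A$Name == "A", ]

res_subset
res_manual
```

Both should return the single row with ID 2. Note that the documentation for subset() recommends it mainly for interactive use, so bracket indexing may be the safer choice inside functions.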
Upvotes: 5