Reputation: 6741
I have a data frame with two columns:
x <- c(1, 2, 3, 4, NA, 5, 6)
y <- c(1, 2, 4, 5, 0, 5, 6)
my.df <- data.frame(x, y)
I want to keep only the rows where x != y.
What I did is this:
my.df <- subset(my.df, x != y)
What I expected was:
x y
3 4
4 5
NA 0
What I got was:
x y
3 4
4 5
This is because, by a strange convention, NA != 0 is NA.
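A quick console check makes this convention concrete. The sketch below only reuses the x and y vectors from the question; the names na_cmp and keep are introduced here for illustration.

```r
# Any comparison involving NA evaluates to NA, not TRUE or FALSE
na_cmp <- NA != 0
is.na(na_cmp)            # TRUE

x <- c(1, 2, 3, 4, NA, 5, 6)
y <- c(1, 2, 4, 5, 0, 5, 6)

# Element-wise comparison: the fifth element is NA because x[5] is NA
keep <- x != y
which(is.na(keep))       # 5

# subset() treats NA in the condition as FALSE, so row 5 is dropped
nrow(subset(data.frame(x, y), x != y))   # 2
```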
I really want to keep the NA in the subset because I'm looking for the differences between the columns.
How can I achieve this?
Upvotes: 3
Views: 88
Reputation: 11490
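This would also work: select only the rows where the difference between x and y is not zero. Because %in% returns FALSE rather than NA for a missing value, the NA row is kept. Referring to the columns through my.df avoids depending on the standalone x and y vectors still matching the data frame.
my.df[!((my.df$x - my.df$y) %in% 0), ]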
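To see why the NA row survives with this approach, here is a small sketch that splits the expression into steps on the question's data. The intermediate names d, is_zero and res are hypothetical and used only for illustration.

```r
x <- c(1, 2, 3, 4, NA, 5, 6)
y <- c(1, 2, 4, 5, 0, 5, 6)
my.df <- data.frame(x, y)

# The difference is NA wherever either column is NA
d <- my.df$x - my.df$y

# %in% is built on match(), so an NA that finds no match gives FALSE, never NA
is_zero <- d %in% 0

# Negating keeps every row whose difference is not exactly zero, including NA
res <- my.df[!is_zero, ]
res
```

Note that a row where both columns are NA also has an NA difference, so it would be kept as well.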
Upvotes: 5
Reputation: 887531
One option would be to add an | condition to keep the rows that have NA for 'x':
subset(my.df, x != y | is.na(x))
If there are also NA elements in 'y':
subset(my.df, x != y | is.na(x) | is.na(y))
It is not clear what should happen when both 'x' and 'y' are NA. If those rows should be dropped because the two values count as the same:
subset(my.df, (x != y | is.na(x) | is.na(y)) & !(is.na(x) & is.na(y)))
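The difference between the three conditions only shows up when 'y' can also be NA. The sketch below uses a made-up data frame, df2, which is not from the question, with one row for each case: equal values, different values, NA in 'x' only, NA in 'y' only, and NA in both.

```r
# Hypothetical data covering each NA case (not from the question)
df2 <- data.frame(x = c(1, 3, NA, 2, NA),
                  y = c(1, 4, 0, NA, NA))

# Keeps rows with NA in x; the row with NA only in y is still dropped
only_x <- subset(df2, x != y | is.na(x))

# Keeps rows with NA in either column, including the both-NA row
either <- subset(df2, x != y | is.na(x) | is.na(y))

# Same as above, but drops the row where both columns are NA
not_both <- subset(df2, (x != y | is.na(x) | is.na(y)) & !(is.na(x) & is.na(y)))

rownames(only_x)     # "2" "3" "5"
rownames(either)     # "2" "3" "4" "5"
rownames(not_both)   # "2" "3" "4"
```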
Upvotes: 4