Hiterunner

Reputation: 83

How do I exclude rows in R based on multiple values?

Let's say I have a dataset that looks like this:

> data
  iso3 Vaccine Coverage
1  ARG    DPT3       95
2  ARG     MCV       94
3  ARG    Pol3       91
4  KAZ    DPT3       99
5  KAZ     MCV       98
6  KAZ    Pol3       99
7  COD    DPT3       67
8  COD     MCV       62
9  COD    Pol3       66

I want to filter out some records based on several conditions being met simultaneously; say, I want to drop any data from Argentina (ARG) with a coverage of more than 93 percent. The result should thus exclude rows 1 and 2:

  iso3 Vaccine Coverage
3  ARG    Pol3       91
4  KAZ    DPT3       99
5  KAZ     MCV       98
6  KAZ    Pol3       99
7  COD    DPT3       67
8  COD     MCV       62
9  COD    Pol3       66

I tried using subset() but it excludes too much:

> subset(data, iso3!="ARG" & Coverage>93)
  iso3 Vaccine Coverage
4  KAZ    DPT3       99
5  KAZ     MCV       98
6  KAZ    Pol3       99

The problem seems to be that the & operator doesn't behave like a boolean AND, which would return the intersection of the two conditions. Instead, it appears to behave like a boolean OR, returning their union.

My question is, what do I use here to force the boolean AND?

Upvotes: 2

Views: 42688

Answers (1)

mnel

Reputation: 115382

!= is an operator meaning "not equal".

! indicates logical negation (NOT)

Your condition

iso3!="ARG" & Coverage>93

is

(iso3 not equal to "ARG") AND (Coverage > 93)

If you want

NOT((iso3 equal to "ARG") AND (Coverage > 93))

You need to write the condition accordingly, e.g.

!(iso3 == 'ARG' & Coverage > 93)
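As a minimal sketch, here is how the negated condition could be applied to the example data with subset(). The data frame is rebuilt by hand from the table in the question, and the names kept and kept2 are only illustrative. The second call uses the equivalent form given by De Morgan's law, NOT(A AND B) = (NOT A) OR (NOT B), which may read more naturally to some people.

```r
# Rebuild the example data from the question (values copied from the table)
data <- data.frame(
  iso3     = rep(c("ARG", "KAZ", "COD"), each = 3),
  Vaccine  = rep(c("DPT3", "MCV", "Pol3"), times = 3),
  Coverage = c(95, 94, 91, 99, 98, 99, 67, 62, 66),
  stringsAsFactors = FALSE
)

# Keep every row EXCEPT those where both conditions hold at once
kept <- subset(data, !(iso3 == "ARG" & Coverage > 93))

# Equivalent condition via De Morgan's law:
# keep rows that are not ARG, or whose coverage is 93 or lower
kept2 <- subset(data, iso3 != "ARG" | Coverage <= 93)

print(kept)
```

Both calls drop only rows 1 and 2 (ARG with coverage above 93) and keep the other seven rows.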

For a complete coverage of logical operators in base R see

help('Logic', package='base')

Upvotes: 13
