Reputation: 331
The title may sound kinda weird but I have found no way of better defining my issue.
Here is an example data set:
test = data.frame(genus = c("Acicarpha", "Acicarpha", "Acicarpha", "Acicarpha", "Acisanthera", "Acisanthera", "Acisanthera", "Acisanthera", "Acmella", "Acmella"), sp1 = c("NA", "bonariensis", "bonariensis", "spathulata", NA, "variabilis", "variabilis", "variabilis", NA, NA))
As you can see, there are a few possible species names (genus + sp1): Acicarpha NA, Acicarpha bonariensis, Acicarpha spathulata, Acisanthera variabilis, Acisanthera NA, and Acmella NA.
Here's the deal: I'm trying to select only the rows for Acmella NA, since the only value in the sp1 column for that genus is NA. Other genera also have NA, but they do not have only NA. How can I do this? I'm bashing my head.
Upvotes: 0
Views: 154
Reputation: 887501
We may use subset from base R:
subset(test, !genus %in% genus[!is.na(sp1)])
     genus  sp1
9  Acmella <NA>
10 Acmella <NA>
Or with filter from dplyr:
library(dplyr)
test %>%
filter(!genus %in% genus[!is.na(sp1)])
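To see why that condition works, it can help to split it into two steps. This is only a rough sketch in base R: the names has_species and only_na are made up for illustration, and stringsAsFactors = FALSE is an addition so the columns stay character on older R versions.

```r
# Example data from the question. The first sp1 value is the quoted
# string "NA" (as posted), which is.na() treats as non-missing.
test <- data.frame(
  genus = c("Acicarpha", "Acicarpha", "Acicarpha", "Acicarpha",
            "Acisanthera", "Acisanthera", "Acisanthera", "Acisanthera",
            "Acmella", "Acmella"),
  sp1 = c("NA", "bonariensis", "bonariensis", "spathulata",
          NA, "variabilis", "variabilis", "variabilis", NA, NA),
  stringsAsFactors = FALSE
)

# Step 1: genera that have at least one non-missing sp1 value
has_species <- unique(test$genus[!is.na(test$sp1)])

# Step 2: keep only the rows whose genus is NOT in that set,
# i.e. genera for which every sp1 value is missing
only_na <- test[!test$genus %in% has_species, ]
only_na
```

Any genus that appears in has_species is excluded entirely, so only genera whose sp1 values are all NA remain.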
Upvotes: 1
Reputation: 631
Here's some code that does what I think you're asking for. It has four steps:
1. Group the data by genus.
2. Create a variable all_sp1_na that is TRUE if all of each genus's sp1 observations are NA, FALSE otherwise (i.e. FALSE if at least one sp1 observation is not NA for that genus).
3. Keep only the rows where all_sp1_na is true.
4. Drop the helper column all_sp1_na.
library(tidyverse)
test %>%
group_by(genus) %>%
mutate(all_sp1_na = all(is.na(sp1))) %>%
filter(all_sp1_na) %>%
select(-all_sp1_na)
And it gives this result:
# A tibble: 2 x 2
# Groups: genus [1]
genus sp1
<chr> <chr>
1 Acmella NA
2 Acmella NA
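If you would rather not load the tidyverse, the same group-wise idea can probably be written in base R with ave(). Treat this as a sketch under assumptions rather than a drop-in replacement: the data are re-created from the question with stringsAsFactors = FALSE added, and the name result is just for illustration.

```r
# Example data from the question (first sp1 value is the string "NA" as posted)
test <- data.frame(
  genus = c("Acicarpha", "Acicarpha", "Acicarpha", "Acicarpha",
            "Acisanthera", "Acisanthera", "Acisanthera", "Acisanthera",
            "Acmella", "Acmella"),
  sp1 = c("NA", "bonariensis", "bonariensis", "spathulata",
          NA, "variabilis", "variabilis", "variabilis", NA, NA),
  stringsAsFactors = FALSE
)

# For every row, TRUE if all sp1 values within that row's genus are missing.
# ave() applies all() per genus and recycles the result across the group.
all_sp1_na <- ave(is.na(test$sp1), test$genus, FUN = all)

# Keep only the rows belonging to genera with no recorded species
result <- test[all_sp1_na, ]
result
```

Unlike the dplyr version, this returns a plain data frame with the original row names and no grouping attached.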
Let me know if you're looking for something else.
Upvotes: 1