tadeufontes

Reputation: 477

Why are the NAs being ignored while I'm using the ifelse/mutate functions?

So I have a data frame with several occurrences of different species and an empty "new_name" column that I want to fill with mutate/ifelse. Basically I want new_name to be filled according to these conditions: if the status is "unaccepted", I want new_name to be the value of "valid_name", and if the status is "accepted" or NA, I want new_name to take the value of "species". This is an example of how my data frame is structured:

example of the data frame

    species             | valid_name               | new_name | status
1.  Tilapia guineensis  | NA                       | NA       | NA
2.  Tilapia zillii      | Hippocampus trimaculatus | NA       | unaccepted
3.  Fundulus rubrifrons | Hippocampus trimaculatus | NA       | unaccepted
4.  Eutrigla gurnardus  | Bougainvillia supercili  | NA       | accepted
5.  Sprattus sprattus   | NA                       | NA       | NA
6.  Gadus morhua        | Aglantha digitale        | NA       | accepted


So far I tried the following:

df<-df%>%
  mutate(new_name = ifelse(status=="unaccepted",valid_name,ifelse(status=="accepted" | is.na(status),species,NA)))

This code only works for the rows where "status" is not NA. For the NA rows it does nothing. I want the data frame to end up something like this:

        species             | valid_name               | new_name                 | status
    1.  Tilapia guineensis  | NA                       | Tilapia guineensis       | NA
    2.  Tilapia zillii      | Hippocampus trimaculatus | Hippocampus trimaculatus | unaccepted
    3.  Fundulus rubrifrons | Hippocampus trimaculatus | Hippocampus trimaculatus | unaccepted
    4.  Eutrigla gurnardus  | Bougainvillia supercili  | Eutrigla gurnardus       | accepted
    5.  Sprattus sprattus   | NA                       | Sprattus sprattus        | NA
    6.  Gadus morhua        | Aglantha digitale        | Gadus morhua             | accepted

Thanks in advance for any answers

Upvotes: 1

Views: 47

Answers (2)

Valeri Voev

Reputation: 2242

I'd like to offer an alternative using case_when from dplyr which offers a nice and intuitive syntax:

library(dplyr)
df <- structure(list(
    species = c("Tilapia guineensis", "Tilapia zillii", "Fundulus rubrifrons",
                "Eutrigla gurnardus", "Sprattus sprattus", "Gadus morhua"),
    valid_name = c(NA, "Hippocampus trimaculatus", "Hippocampus trimaculatus",
                   "Bougainvillia supercili", NA, "Aglantha digitale"),
    status = c(NA, "unaccepted", "unaccepted", "accepted", NA, "accepted")
), class = "data.frame", row.names = c(NA, -6L))

df <- df %>% 
    mutate(new_name = case_when(
        status == "unaccepted" ~ valid_name,
        status == "accepted" | is.na(status) ~ species
    ))
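As far as I understand it, this works because case_when treats a condition that evaluates to NA as not matched and moves on to the next formula, so rows with a missing status fall through to the second condition, where is.na(status) is TRUE. A minimal sketch on a bare vector (the `status` vector and the two labels here are only illustrative, not part of the original data):

```r
library(dplyr)

# Illustrative vector: one missing, one unaccepted, one accepted
status <- c(NA, "unaccepted", "accepted")

res <- case_when(
  # NA == "unaccepted" is NA, which case_when treats as "no match"
  status == "unaccepted" ~ "use valid_name",
  # NA | TRUE is TRUE, so the NA element matches here
  status == "accepted" | is.na(status) ~ "use species"
)
res
#[1] "use species"    "use valid_name" "use species"
```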

Upvotes: 0

akrun

Reputation: 887088

A comparison with == returns NA when the value is NA, so add an is.na check to get a strict TRUE/FALSE. Otherwise the NAs remain as NA:

library(dplyr)
df%>%
  mutate(new_name = ifelse(status=="unaccepted" & !is.na(status),valid_name,
           ifelse(status=="accepted" & !is.na(status),species,species)))
#              species               valid_name     status                 new_name
#1  Tilapia guineensis                     <NA>       <NA>       Tilapia guineensis
#2      Tilapia zillii Hippocampus trimaculatus unaccepted Hippocampus trimaculatus
#3 Fundulus rubrifrons Hippocampus trimaculatus unaccepted Hippocampus trimaculatus
#4  Eutrigla gurnardus  Bougainvillia supercili   accepted       Eutrigla gurnardus
#5   Sprattus sprattus                     <NA>       <NA>        Sprattus sprattus
#6        Gadus morhua        Aglantha digitale   accepted             Gadus morhua

Another option is to use %in%, which returns FALSE for NA:

df%>%
  mutate(new_name = ifelse(status %in% "unaccepted" ,valid_name,
           ifelse(status %in% "accepted",species, species)))

Using a small reproducible example:

v1 <- c('a', 'b', NA)
v1 == 'a'
#[1]  TRUE FALSE    NA  ####

v1 %in% 'a'
#[1]  TRUE FALSE FALSE
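The same thing carries over into ifelse: where the test is NA, the result is NA and neither the yes nor the no value is used. That would explain why the is.na(status) check inside the inner ifelse of the question is never reached for the NA rows. A small sketch reusing v1 (the "yes"/"no" labels are just placeholders):

```r
v1 <- c('a', 'b', NA)

# An NA test yields NA; the 'no' branch is not consulted for that element
r_eq <- ifelse(v1 == 'a', "yes", "no")
r_eq
#[1] "yes" "no"  NA

# %in% never returns NA, so the third element falls into the 'no' branch
r_in <- ifelse(v1 %in% 'a', "yes", "no")
r_in
#[1] "yes" "no"  "no"
```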

data

df <- structure(list(species = c("Tilapia guineensis", "Tilapia zillii", 
"Fundulus rubrifrons", "Eutrigla gurnardus", "Sprattus sprattus", 
"Gadus morhua"), valid_name = c(NA, "Hippocampus trimaculatus", 
"Hippocampus trimaculatus", "Bougainvillia supercili", NA, 
"Aglantha digitale"
), status = c(NA, "unaccepted", "unaccepted", "accepted", NA, 
"accepted")), class = "data.frame", row.names = c(NA, -6L))

Upvotes: 1
