user11036517

Reputation: 55

Using mutate + case_when to create a new column by comparing the values of a df column to a set of vectors

I am trying to use dplyr::mutate to create a new column in a df based on a couple of conditions, but I am getting back an error telling me that the objects I am comparing (vectors vs df$column) have unequal lengths. Here's the error message:

Error in mutate(): ! Problem while computing Analye_Class = case_when(...). Caused by error in names(message) <- `*vtmp*`: ! 'names' attribute [1] must be the same length as the vector [0] Run rlang::last_error() to see where the error occurred.

I figured out how to do this using the %in% operator and would like to find out how to achieve the same result using dplyr::mutate.

I started by defining two vectors, "Class1" and "Class2", to assign each observation to a class based on whether the value of the "Analyte" column is found in either of the vectors.

Below is my code:

library(dplyr)

Class1 <- c("a","b")
Class2 <- c("c","d")
df <- data.frame (Site = c("A","A","B","B","C","C"),
            Analyte = c("a","b","a","b","c","d"),
            Class = rep(c("NA"),6))

#If value of "Analyte" column is found in vectors "Class1" or "Class2" then 
#set value of "Class" column to name of corresponding vector, using %in% operator

df1 <- within(df,{
  Class=NA
  Class[Analyte %in% Class1]="Class1"
  Class[Analyte %in% Class2]="Class2"
})
df1

#If value of "Analyte" column is found in vectors "Class1" or "Class2" then 
#set value of "Class" column to name of corresponding vector, using dplyr::mutate produces 
#unequal length error message

df2 <- df %>% dplyr::mutate(Analye_Class = case_when(
  Analyte %in% Class1 ~ "Class1",
  Analyte %in% Class2 ~ "Class2",
  TRUE ~ NA
))
df2

Thank you for your help.

Upvotes: 1

Views: 570

Answers (1)

Shafee

Reputation: 19897

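Simply use NA_character_ instead of NA. In dplyr versions before 1.1.0, case_when() requires every right-hand side to have the same type, and a bare NA is logical while "Class1" and "Class2" are character.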

library(dplyr)

Class1 <- c("a","b")
Class2 <- c("c","d")
df <- data.frame (Site = c("A","A","B","B","C","C"),
                  Analyte = c("a","b","a","b","c","d"),
                  Class = rep(c("NA"),6))


df2 <- df %>% dplyr::mutate(Analye_Class = case_when(
    Analyte %in% Class1 ~ "Class1",
    Analyte %in% Class2 ~ "Class2",
    TRUE ~ NA_character_
))

df2

#>   Site Analyte Class Analye_Class
#> 1    A       a    NA       Class1
#> 2    A       b    NA       Class1
#> 3    B       a    NA       Class1
#> 4    B       b    NA       Class1
#> 5    C       c    NA       Class2
#> 6    C       d    NA       Class2

Created on 2022-07-11 by the reprex package (v2.0.1)
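
To see why the two constants are not interchangeable here, a minimal base R sketch (no dplyr needed) can compare their types. The names na_types and labels are made up for illustration and are not part of the original code.

```r
# Compare the storage type of the bare NA with the typed character NA.
na_types <- c(plain = typeof(NA), typed = typeof(NA_character_))
print(na_types)
#>       plain       typed
#>   "logical" "character"

# A typed NA sits alongside strings without changing the vector's type.
labels <- c("Class1", "Class2", NA_character_)
print(is.character(labels))
#> [1] TRUE
```

The other typed constants (NA_integer_, NA_real_) follow the same pattern, so the one to pick is whichever matches the type of the other right-hand sides in case_when().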

Upvotes: 1
