Lia_G

Reputation: 175

How can I assign a value with case_when() from dplyr based on another column value?

Is there a way to assign the value of the column being created using an existing value from another column when using case_when() with mutate()?

The actual dataframe I'm dealing with is quite complicated so here is a trivial example of what I want:

library(dplyr)

df = tibble(Assay = c("A", "A", "B", "C", "D", "D"), 
            My_ID = c(3, 12, 36, 5, 13, 1), 
            Modifier = c(12, 6,  5, 9, 3, 6)) 

new_df = df %>% 
  mutate(Assay = case_when(
    My_ID == 5 ~ "C/D",
    My_ID == 12  ~ "Rm",
    My_ID == 13 | My_ID == 3  ~ Modifier * 3,
    TRUE ~ Assay)) %>% 
  select(-Modifier)

Expected new_df:

# A tibble: 6 x 2
  Assay My_ID
  <chr> <dbl>
1 36        3
2 Rm       12
3 B        36
4 C/D       5
5 9        13
6 D         1

I can successfully assign NA to the column I am mutating when no cases match, but I haven't found a way to assign a value derived from another column in the same data frame. I get this error:

Error: Problem with `mutate()` column `Assay`.
i `Assay = case_when(...)`.
x must be a character vector, not a double vector.

Is there a way to do this?

Upvotes: 3

Views: 6722

Answers (1)

Lia_G

Reputation: 175

After some experimenting, I found that paste() makes this work. As a commenter noted, the underlying problem is a type mismatch: every right-hand side in case_when() has to return the same type. The Assay column is a character vector, but Modifier * 3 is numeric (a double), which is what the error message is complaining about. paste() and paste0() fix the problem because they implicitly convert their input to character, but as.character() addresses the issue directly.
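To see the mismatch in isolation, here is a minimal base R sketch. The vectors assay and modifier are hypothetical stand-ins for the two columns, not part of the original data frame:

```r
# Hypothetical stand-ins for the Assay and Modifier columns
assay <- c("A", "D")
modifier <- c(12, 3)

typeof(assay)          # "character"
typeof(modifier * 3)   # "double", so it cannot be mixed with character results

# Explicit conversion to character
converted <- as.character(modifier * 3)
typeof(converted)      # "character"

# paste0() performs the same conversion implicitly
pasted <- paste0(modifier * 3)
identical(converted, pasted)   # TRUE
```

Once every branch yields a character vector, case_when() has a single type to work with.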

library(dplyr)

df = tibble(Assay = c("A", "A", "B", "C", "D", "D"), 
            My_ID = c(3, 12, 36, 5, 13, 1), 
            Modifier = c(12, 6,  5, 9, 3, 6)) 

new_df = df %>% 
  mutate(Assay = case_when(
    My_ID == 5 ~ "C/D",
    My_ID == 12  ~ "Rm",
    My_ID == 13 | My_ID == 3  ~ as.character(Modifier * 3),
    TRUE ~ Assay)) %>% 
  select(-Modifier)

This is the output:

print(new_df)
# A tibble: 6 x 2
  Assay My_ID
  <chr> <dbl>
1 36        3
2 Rm       12
3 B        36
4 C/D       5
5 9        13
6 D         1 
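
For comparison, here is a sketch of the paste0() variant mentioned above. It should give the same result on this example data; the name new_df_paste is only used here to keep it apart from new_df:

```r
library(dplyr)

df = tibble(Assay = c("A", "A", "B", "C", "D", "D"),
            My_ID = c(3, 12, 36, 5, 13, 1),
            Modifier = c(12, 6, 5, 9, 3, 6))

new_df_paste = df %>%
  mutate(Assay = case_when(
    My_ID == 5 ~ "C/D",
    My_ID == 12 ~ "Rm",
    # paste0() implicitly converts the numeric result to character
    My_ID == 13 | My_ID == 3 ~ paste0(Modifier * 3),
    TRUE ~ Assay)) %>%
  select(-Modifier)
```

I still prefer as.character() because it states the intent (a type conversion) rather than relying on a side effect of string concatenation.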

Upvotes: 4
