dplyr package - mutate

Question

Good night! I'm brazilian and I don't speak english very well. I work a database with more than 10000 rows, represented by the example below.

df <- data.frame(
    PROCESS = c(180022121, 180022121, 180022105, 180022105, 180022097, 180022097, 180022097, 180022501, 180022501), 
    NAME = c("A_NONIMATO", "B_NONIMATO", "C_NONIMATO", "C_NONIMATO", "D_NONIMATO", "E_NONIMATO", "F_NONIMATO", "G_NONIMATO", "G_NONIMATO"),DATE = c("02/01/2018", "02/01/2018", "01/01/2018", "01/01/2018", "01/01/2018", "01/01/2018", "01/01/2018", "02/01/2018", "02/01/2018"), 
    CRIME = c("ART.33", "ART.33", "ART.35", "ART.33", "ART.155", "ART.155", "ART.155", "ART.157", "ART.14CP"))

HYPOTHESIS:

a) same PROCESS, different PARTNAME, same CRIME (lines 1 and 2)

b) same PROCESS, same PARTNAME, different CRIME (lines 3 and 4)

c) same PROCESS, same PARTNAME, different CRIME (lines 8 and 9)

My question concerns hypothesis c: I need to create a column CRIME2 to transfer ART.14CP turning the case into just one line… without moving to hypothesis b. Throughout the database will always be written ART.14CP.

To look like this:

df2 <- data.frame(
    PROCESS = c(180022121, 180022121, 180022105, 180022105, 180022097, 180022097, 180022097, 180022501),
    NAME = c("A_NONIMATO", "B_NONIMATO", "C_NONIMATO", "C_NONIMATO", "D_NONIMATO", "E_NONIMATO", "F_NONIMATO", "G_NONIMATO"),
    DATE = c("02/01/2018", "02/01/2018", "01/01/2018", "01/01/2018", "01/01/2018", "01/01/2018", "01/01/2018", "02/01/2018"),
    CRIME = c("ART.33", "ART.33", "ART.35", "ART.33", "ART.155", "ART.155", "ART.155", "ART.157"),
    CRIME2 = c("", "", "", "", "", "", "", "ART.14CP"))

Thank you very much for your attention.

GenesRus · Accepted Answer

Here's a slight modification of Ronak's answer using case_when and another way of approaching the logic. I prefer case_when to if or ifelse since it's more intuitive to others reading your code, especially if you need to use multiple conditions or have numerous cases. The TRUE condition will capture any remaining cases; if you don't put it, those cases be set to NA by default.

library(dplyr)
df %>% 
  group_by(PROCESS, NAME) %>%
  mutate(CRIME2 = case_when(
    "ART.14CP" %in% CRIME ~ "ART.14CP",
    TRUE ~ ""
  )) %>% 
  filter(CRIME != "ART.14CP")

dplyr package - mutate

Answers (2)

Related Questions