Reputation: 6874
Data
I have a dataframe as follows:
structure(list(EndoscopyEventRaw = c("", "", "oesophagus:rfa;oesophagus:nac",
"oesophagus:rfa;oesophagus:nac", "oesophagus:brushings", "oesophagus:rfa;oesophagus:emr;oesophagus:nac",
"oesophagus:apc", "oesophagus:apc;oesophagus:nac", "oesophagus:apc",
"")), row.names = c(NA, 10L), class = "data.frame")
Aim
I'd like to extract aspects of this into a new column, possibly using case_when
based on rules as follows:
dataframe<-dataframe %>% mutate(OPCS4ZCode2 = case_when(
grepl("nac",EndoscopyEventRaw)~ "CodeForNAC",
grepl("apc",EndoscopyEventRaw) ~ "CodeForAPC",
grepl("rfa",EndoscopyEventRaw) ~ "CodeForRFA",
grepl("grasp",EndoscopyEventRaw) ~ "CodeForGrasp"
),
TRUE ~ ""
)
Problem and desired result
However, some of the rows have more than one element that code be codified into the new column so that the end result should be:
1
2
3 CodeForRFA,CodeForNAC
4 CodeForRFA,CodeForNAC
5
6 CodeForRFA,CodeForNAC
7 CodeForAPC
8 CodeForAPC,CodeForNAC
9 CodeForAPC
10
When I use case_when
it stops looking when the first match is found. Is there a way of matching all the targets as above with or without using case_when
?
Upvotes: 0
Views: 47
Reputation: 388807
Yes, case_when
skips other conditions once a match is found. One way is to separate the data into different rows and then use the conditions with case_when
and summarise the data back.
library(dplyr)
df %>%
mutate(row = row_number()) %>%
tidyr::separate_rows(EndoscopyEventRaw, sep = ";") %>%
mutate(OPCS4ZCode2 = case_when(grepl("nac",EndoscopyEventRaw)~ "CodeForNAC",
grepl("apc",EndoscopyEventRaw) ~ "CodeForAPC",
grepl("rfa",EndoscopyEventRaw) ~ "CodeForRFA",
grepl("grasp",EndoscopyEventRaw) ~ "CodeForGrasp",
TRUE ~ "")) %>%
group_by(row) %>%
summarise(OPCS4ZCode2 = toString(OPCS4ZCode2)) %>%
select(-row)
# A tibble: 10 x 1
# OPCS4ZCode2
# <chr>
# 1 ""
# 2 ""
# 3 CodeForRFA, CodeForNAC
# 4 CodeForRFA, CodeForNAC
# 5 ""
# 6 CodeForRFA, , CodeForNAC
# 7 CodeForAPC
# 8 CodeForAPC, CodeForNAC
# 9 CodeForAPC
#10 ""
Upvotes: 2