How can I automate this simple conditional column operation in R?

Question

I have a data frame that looks like the following:

tibble(term = c(
  rep("a:b", 2),
  rep("b:a", 2),
  rep("c:d", 2),
  rep("d:c", 2),
  rep("g:h", 2),
  rep("h:g", 2)
))

I would like to add an extra column to this data frame that takes the same value for any two terms that contain the same characters in reversed order around the ":" (i.e. a:b and b:a would be coded the same way; likewise c:d and d:c, and all the other pairs).

I thought of something like the following:

%>%
  mutate(term_adjusted = case_when(grepl("a:b|b:a", term) ~ "a:b"))

but I have a large number of these pairs in my dataset and would like a way to automate that, hence my question:

How can I do this operation automatically without having to hard code for each pair separately?

Thank you!

ktiu · Accepted Answer

How about:

library(dplyr)

your_data %>%
  mutate(term_adjusted = term %>%
                           strsplit(":") %>%
                           purrr::map_chr(~ .x %>%
                                           sort() %>%
                                           paste(collapse = ":")))

Base R option (the native pipe |> needs R 4.1 or later)

your_data$term_adjusted <- your_data$term |>
                             strsplit(":") |>
                             lapply(sort) |>
                             lapply(paste, collapse = ":") |>
                             unlist()

Either returns:

# A tibble: 12 x 2
   term  term_adjusted
   <chr> <chr>
 1 a:b   a:b
 2 a:b   a:b
 3 b:a   a:b
 4 b:a   a:b
 5 c:d   c:d
 6 c:d   c:d
 7 d:c   c:d
 8 d:c   c:d
 9 g:h   g:h
10 g:h   g:h
11 h:g   g:h
12 h:g   g:h
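
If you need this in more than one place, the same split, sort and paste steps can be wrapped in a small function. This is only a sketch in base R: the name normalize_pair and its sep argument are made up for illustration and do not come from any package.

```r
# Sketch: build an order-independent label for terms such as "b:a".
# `normalize_pair` is a hypothetical helper name, not an existing function.
normalize_pair <- function(term, sep = ":") {
  # Split each term on the literal separator, e.g. "b:a" -> c("b", "a")
  pieces <- strsplit(term, sep, fixed = TRUE)
  # Sort the pieces of each term and paste them back into one string
  vapply(pieces, function(p) paste(sort(p), collapse = sep), character(1))
}

terms <- c("a:b", "b:a", "c:d", "d:c", "g:h", "h:g")
normalize_pair(terms)
# [1] "a:b" "a:b" "c:d" "c:d" "g:h" "g:h"
```

With the data from the question this would be used as your_data$term_adjusted <- normalize_pair(your_data$term). Two caveats: sort() drops NA by default, so a missing term comes back as an empty string rather than NA, and the ordering follows your locale's collation, which only matters if the pieces mix case or contain non-ASCII characters.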
