Reputation: 327
I have a data frame that looks like the following:
tibble(term = c(
rep("a:b", 2),
rep("b:a", 2),
rep("c:d", 2),
rep("d:c", 2),
rep("g:h", 2),
rep("h:g", 2)
))
I would like to add an extra column to this data frame that takes the same value for any two terms that contain the same characters in reversed order around the ":" (i.e. a:b and b:a would be coded the same way, and likewise for c:d and d:c and all the other pairs).
I thought of something like the following:
df %>%
mutate(term_adjusted = case_when(grepl("a:b|b:a", term) ~ "a:b"))
but I have a large number of these pairs in my dataset and would like a way to automate that, hence my question:
How can I do this operation automatically without having to hard code for each pair separately?
Thank you!
Upvotes: 1
Views: 64
Reputation: 388982
A tidyverse option:
library(dplyr)
library(tidyr)
df %>%
separate(term, c('term1', 'term2'), sep = ':', remove = FALSE) %>%
mutate(col1 = pmin(term1, term2), col2 = pmax(term1, term2)) %>%
unite(result, col1, col2, sep = ':') %>%
select(term, result)
# term result
# <chr> <chr>
# 1 a:b a:b
# 2 a:b a:b
# 3 b:a a:b
# 4 b:a a:b
# 5 c:d c:d
# 6 c:d c:d
# 7 d:c c:d
# 8 d:c c:d
# 9 g:h g:h
#10 g:h g:h
#11 h:g g:h
#12 h:g g:h
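This works because pmin() and pmax() compare character vectors element-wise, so the alphabetically smaller half always ends up first. Here is a minimal base R sketch of the same idea, in case you want to see it without the tidyr steps. The vector names (term, first, second, result) are just placeholders of mine, and it assumes every term has exactly two parts separated by ":".

```r
# Sample terms, taken from the question
term <- c("a:b", "b:a", "c:d", "d:c")

# Split each term on the literal ":" and pull out the two halves
parts  <- strsplit(term, ":", fixed = TRUE)
first  <- vapply(parts, `[`, character(1), 1)
second <- vapply(parts, `[`, character(1), 2)

# pmin()/pmax() pick the smaller/larger string for each position,
# so reversed pairs collapse to the same label
result <- paste(pmin(first, second), pmax(first, second), sep = ":")
result
# [1] "a:b" "a:b" "c:d" "c:d"
```

String comparison follows the collation of your locale, which should not matter for plain lowercase letters like these.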
Upvotes: 1
Reputation: 2626
How about:
library(dplyr)
your_data %>%
mutate(term_adjusted = term %>%
strsplit(":") %>%
purrr::map_chr(~ .x %>%
sort() %>%
paste(collapse = ":")))
Base R option (the native pipe |> needs R 4.1 or later):
your_data$term_adjusted <- your_data$term |>
strsplit(":") |>
lapply(sort) |>
lapply(paste, collapse = ":") |>
unlist()
Either returns:
# A tibble: 12 x 2
term term_adjusted
<chr> <chr>
1 a:b a:b
2 a:b a:b
3 b:a a:b
4 b:a a:b
5 c:d c:d
6 c:d c:d
7 d:c c:d
8 d:c c:d
9 g:h g:h
10 g:h g:h
11 h:g g:h
12 h:g g:h
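If you need this in several places, the split, sort and paste steps can be wrapped in a small helper. This is only a sketch: normalize_pair is a name I made up, not a function from any package, and it assumes the separator is a literal string rather than a regular expression.

```r
# Hypothetical helper: sort the pieces of each term and glue them back together
normalize_pair <- function(x, sep = ":") {
  pieces <- strsplit(x, sep, fixed = TRUE)
  # vapply() guarantees exactly one string per input element
  vapply(pieces, function(p) paste(sort(p), collapse = sep), character(1))
}

normalize_pair(c("a:b", "b:a", "h:g"))
# [1] "a:b" "a:b" "g:h"
```

It can then be used as your_data$term_adjusted <- normalize_pair(your_data$term), or inside mutate(). One caveat: sort() drops NA, so a missing term comes back as an empty string, which you may want to handle separately.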
Upvotes: 3