Reputation: 1076
I would like to create all possible pairs between rows of a dataframe without duplicates (i.e. A_B is the same as B_A).
Is there an elegant way to do this in the tidyverse?
Example data:
library(tidyverse)
df <- tibble(
  id = 1:5,
  name = c('Alice', 'Bob', 'Charlie', 'Diane', 'Fred')
)
Expected output:
> df_pairs
# A tibble: 10 x 2
   id    name
   <chr> <chr>
 1 1_2   Alice_Bob
 2 1_3   Alice_Charlie
 3 1_4   Alice_Diane
 4 1_5   Alice_Fred
 5 2_3   Bob_Charlie
 6 2_4   Bob_Diane
 7 2_5   Bob_Fred
 8 3_4   Charlie_Diane
 9 3_5   Charlie_Fred
10 4_5   Diane_Fred
I was able to do it with crossing, but I'd like to know if there is an easier way:
df_pairs <- df %>%
  select(id1 = id, name1 = name) %>%
  crossing(df %>% select(id2 = id, name2 = name)) %>%
  dplyr::filter(id1 < id2) %>%
  unite(id, id1, id2) %>%
  unite(name, name1, name2)
Upvotes: 3
Views: 423
Reputation: 388982
You can use combn, which generates each unordered pair exactly once and so avoids duplicates.
get_combn <- function(x) {
  # Take every 2-element combination of x and join each pair with "_"
  combn(x, 2, paste, collapse = "_")
}
as.data.frame(lapply(df, get_combn))
#    id          name
#1  1_2     Alice_Bob
#2  1_3 Alice_Charlie
#3  1_4   Alice_Diane
#4  1_5    Alice_Fred
#5  2_3   Bob_Charlie
#6  2_4     Bob_Diane
#7  2_5      Bob_Fred
#8  3_4 Charlie_Diane
#9  3_5  Charlie_Fred
#10 4_5    Diane_Fred
The get_combn function can also be applied to each column with purrr::map_df:
purrr::map_df(df, get_combn)
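To see why applying combn column by column gives matching rows, here is a minimal base R sketch. The vectors ids and nms are hypothetical stand-ins for the two columns of the example df, and the other variable names are illustrative only.

```r
# Base R only; ids and nms mirror the columns of the example df.
ids <- 1:5
nms <- c("Alice", "Bob", "Charlie", "Diane", "Fred")

# Without FUN, combn() returns a matrix with one column per unordered pair.
pair_matrix <- combn(ids, 2)
dim(pair_matrix)  # 2 rows, choose(5, 2) = 10 columns

# With FUN = paste and collapse = "_", each pair becomes a single string.
id_pairs   <- combn(ids, 2, paste, collapse = "_")
name_pairs <- combn(nms, 2, paste, collapse = "_")

# Both calls enumerate positions in the same order, so the rows line up.
data.frame(id = id_pairs, name = name_pairs)
```

Because combn works on positions rather than values, this should hold for any columns of equal length, but it also means every column gets pasted into a string; if the paired values need to stay in separate columns, the crossing and filter approach from the question may be the better fit.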
Upvotes: 3