Chris

Reputation: 515

Split strings and get all combinations in which at most one item per combination shares the same pre-split value

Here's the same task we solved in Python. I've tried a similar approach in R: build an empty dictionary keyed by the pre-split strings (using R's strsplit) and unpack all corresponding post-split strings into it as values. The next step is to create all combinations in which each pre-split string appears at most once.

Here is my input list:

list('ROOM1-abc',
'ROOM1-def',
'ROOM2-abc',
'ROOM2-lol',
'ROOM3-whatever')

And the desired output for combinations of length 2 (I need to be able to choose the combination length):

[['ROOM1-abc', 'ROOM2-abc'],
['ROOM1-abc', 'ROOM2-lol'],
['ROOM1-abc', 'ROOM3-whatever'],
['ROOM1-def', 'ROOM2-abc'],
['ROOM1-def', 'ROOM2-lol'],
['ROOM1-def', 'ROOM3-whatever'],
['ROOM2-abc', 'ROOM3-whatever'],
['ROOM2-lol', 'ROOM3-whatever']]

I'm struggling with the differences between Python and R syntax for indexing sub-items of a list, and with having to learn R for a specific need on a problem we've already solved in Python.

Upvotes: 0

Views: 206

Answers (2)

user11538509

Reputation:

If I understand correctly, what you want to do is:

df <- expand.grid(unlist(lst1), unlist(lst1))
df
             Var1           Var2
1       ROOM1-abc      ROOM1-abc
2       ROOM1-def      ROOM1-abc
3       ROOM2-abc      ROOM1-abc
4       ROOM2-lol      ROOM1-abc
5  ROOM3-whatever      ROOM1-abc
6       ROOM1-abc      ROOM1-def
7       ROOM1-def      ROOM1-def
8       ROOM2-abc      ROOM1-def
9       ROOM2-lol      ROOM1-def
10 ROOM3-whatever      ROOM1-def
11      ROOM1-abc      ROOM2-abc
12      ROOM1-def      ROOM2-abc
13      ROOM2-abc      ROOM2-abc
14      ROOM2-lol      ROOM2-abc
15 ROOM3-whatever      ROOM2-abc
16      ROOM1-abc      ROOM2-lol
17      ROOM1-def      ROOM2-lol
18      ROOM2-abc      ROOM2-lol
19      ROOM2-lol      ROOM2-lol
20 ROOM3-whatever      ROOM2-lol
21      ROOM1-abc ROOM3-whatever
22      ROOM1-def ROOM3-whatever
23      ROOM2-abc ROOM3-whatever
24      ROOM2-lol ROOM3-whatever
25 ROOM3-whatever ROOM3-whatever

This gives a data frame with all possible ordered pairs. Unlike akrun's suggestion, it also pairs each element with itself, e.g. ROOM1-abc | ROOM1-abc, and it is order-sensitive, so you get both ROOM3-whatever | ROOM1-abc and ROOM1-abc | ROOM3-whatever.

If you do not care about order, you can remove the rows that duplicate another row once each row is sorted:

df[!duplicated(t(apply(df, 1, sort))), ]
             Var1           Var2
1       ROOM1-abc      ROOM1-abc
2       ROOM1-def      ROOM1-abc
3       ROOM2-abc      ROOM1-abc
4       ROOM2-lol      ROOM1-abc
5  ROOM3-whatever      ROOM1-abc
7       ROOM1-def      ROOM1-def
8       ROOM2-abc      ROOM1-def
9       ROOM2-lol      ROOM1-def
10 ROOM3-whatever      ROOM1-def
13      ROOM2-abc      ROOM2-abc
14      ROOM2-lol      ROOM2-abc
15 ROOM3-whatever      ROOM2-abc
19      ROOM2-lol      ROOM2-lol
20 ROOM3-whatever      ROOM2-lol
25 ROOM3-whatever ROOM3-whatever

EDIT

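# split each string at "-" into its room prefix and suffix
split <- strsplit(unlist(lst1), "-")
# re-insert "-" between the two parts, giving c(prefix, "-", suffix)
split2 <- lapply(split, function(x){
  c(x[1], "-", x[2])})
# combine into a data frame with one column per input string (if desired)
do.call("cbind.data.frame", split2)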
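Neither table above applies the question's rule that a pair may contain at most one item per room. The following is a minimal sketch of one way to add that filter to the expand.grid approach, assuming the room is always the text before the first "-". The helper name room_of is made up for illustration and is not part of the original answer.

```r
# Input from the question (named lst1, as in the code above)
lst1 <- list("ROOM1-abc", "ROOM1-def", "ROOM2-abc", "ROOM2-lol", "ROOM3-whatever")
items <- unlist(lst1)

# Hypothetical helper: room prefix = the part before the first "-"
room_of <- function(x) {
  vapply(strsplit(x, "-", fixed = TRUE), `[`, character(1), 1)
}

# All ordered pairs, kept as character columns instead of factors
df <- expand.grid(Var1 = items, Var2 = items, stringsAsFactors = FALSE)

# Keeping only rows where Var1's room sorts strictly before Var2's room
# drops same-room pairs and keeps exactly one ordering of every other pair
res <- df[room_of(df$Var1) < room_of(df$Var2), ]
rownames(res) <- NULL
res
```

For the sample input this leaves 8 rows. The strict comparison works for any room names because it only needs some consistent ordering, not a meaningful one.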

Upvotes: 1

akrun

Reputation: 887098

An option is to use combn on the list and return the result as a list of character vectors:

library(tidyverse)
combn(lst1, 2, simplify = FALSE) %>%
       map(flatten_chr)

data

lst1 <- list('ROOM1-abc',
'ROOM1-def',
'ROOM2-abc',
'ROOM2-lol',
'ROOM3-whatever')
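Note that combn on its own also returns same-room pairs such as ROOM1-abc with ROOM1-def. Below is a base R sketch of how the result could be filtered so that each room appears at most once, with a selectable combination length m. The function name room_combos and its sep argument are assumptions made for illustration, and the room is assumed to be the text before the first separator.

```r
lst1 <- list("ROOM1-abc", "ROOM1-def", "ROOM2-abc", "ROOM2-lol", "ROOM3-whatever")

# Hypothetical helper: all m-length combinations with at most one item per room
room_combos <- function(x, m, sep = "-") {
  x <- unlist(x)
  # every unordered m-length combination, as a list of character vectors
  combos <- combn(x, m, simplify = FALSE)
  Filter(function(v) {
    # room prefix of each item in this combination
    rooms <- vapply(strsplit(v, sep, fixed = TRUE), `[`, character(1), 1)
    # anyDuplicated() returns 0 when no room occurs twice
    anyDuplicated(rooms) == 0
  }, combos)
}

pairs <- room_combos(lst1, 2)
triples <- room_combos(lst1, 3)
length(pairs)    # 8
length(triples)  # 4
```

Generating every combination first and filtering afterwards is simple but does unnecessary work on large inputs; for those, picking m rooms first and then one item per chosen room would avoid building the rejected combinations.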

Upvotes: 0
