combn-like task for combinations as elements in R

Question

not sure if I chose a good title at all... Also I don't know if I use the right terminology, so maybe with the right search terms I would find a solution to this problem...

I have a list of strings from which I want to have all sets of "exclusive" combinations of 3.

Example: With the following

require(utils)
mylist<-c("strA","strB","strC","strD","strE","strF")
t(combn(mylist,3))

I get a table that lists all possible combinations of 3 out of those 6 strings (so each row represents one combination of 3):

        [,1]   [,2]   [,3]  
   [1,] "strA" "strB" "strC"
   [2,] "strA" "strB" "strD"
   [3,] "strA" "strB" "strE"
   [4,] "strA" "strB" "strF"
   [5,] "strA" "strC" "strD"
   [6,] "strA" "strC" "strE"
   [7,] "strA" "strC" "strF"
   [8,] "strA" "strD" "strE"
   [9,] "strA" "strD" "strF"
  [10,] "strA" "strE" "strF"
  [11,] "strB" "strC" "strD"
  [12,] "strB" "strC" "strE"
  [13,] "strB" "strC" "strF"
  [14,] "strB" "strD" "strE"
  [15,] "strB" "strD" "strF"
  [16,] "strB" "strE" "strF"
  [17,] "strC" "strD" "strE"
  [18,] "strC" "strD" "strF"
  [19,] "strC" "strE" "strF"
  [20,] "strD" "strE" "strF"

but I would like to have all sets of combinations of 3, in which each string only appears once. So my desired output would look sth like this:

$1
        [,1]   [,2]   [,3]  
   [1,] "strA" "strB" "strC"
   [2,] "strD" "srtE" "strF"
$2
        [,1]   [,2]   [,3]  
   [1,] "strA" "strB" "strD"
   [1,] "strC" "strE" "strF"
$3
        [,1]   [,2]   [,3]  
   [1,] "strA" "strB" "strE"
   [1,] "strC" "strD" "strF"
...

So here each subelement ($1, $2, $3, etc.) contains 2 combinations of 3 strings (as 2*3=6; with 6 strings). In each set each string must not appear more than once.

Of course it would be great if this would be also possible for lengths of mylist that are not a multiple of n=3. In case we have let's say 10 strings (with the addition of "strG", "strH", "strI", and "strJ"), I would like to have one string ommitted in each combination. So the desired result would be sth like

$1
        [,1]   [,2]   [,3]  
   [1,] "strA" "strB" "strC"
   [2,] "strD" "strE" "strF"
   [3,] "strG" "strH" "strI"
$2
        [,1]   [,2]   [,3]  
   [1,] "strA" "strB" "strC"
   [2,] "strD" "strE" "strF"
   [3,] "strG" "strH" "strJ"
$3
        [,1]   [,2]   [,3]  
   [1,] "strA" "strB" "strC"
   [2,] "strD" "strE" "strF"
   [3,] "strG" "strI" "strJ"
$4
        [,1]   [,2]   [,3]  
   [1,] "strA" "strB" "strC"
   [2,] "strE" "strF" "strG"
   [3,] "strH" "strI" "strJ"
...

Does someone have a solution to this? If my explanation was unclear, please let me know.

Cheers

IRTFM · Accepted Answer

Will assume the transposed combo matrix is named mat. Check to see if there is any overlap with length applied to results of intersect function:

 res <- list();
 for (i in 1:nrow(mat) ){
    for( j in 1:nrow(mat)){  
          if( !length(intersect(mat[i,] , mat[j,])) ) 
               res[[paste(i,j,sep="_")]] <- rbind( mat[i,], mat[j, ]) } }


> res
$`1_20`
     [,1]   [,2]   [,3]  
[1,] "strA" "strB" "strC"
[2,] "strD" "strE" "strF"

$`2_19`
     [,1]   [,2]   [,3]  
[1,] "strA" "strB" "strD"
[2,] "strC" "strE" "strF"

$`3_18`
     [,1]   [,2]   [,3]  
[1,] "strA" "strB" "strE"
[2,] "strC" "strD" "strF"

.... snipped

Depending on your definition of "unique" you might decide to only take the first ten items since half of these are transpositions of rows:

> res[[1]]
     [,1]   [,2]   [,3]  
[1,] "strA" "strB" "strC"
[2,] "strD" "strE" "strF"
> res[[20]]
     [,1]   [,2]   [,3]  
[1,] "strD" "strE" "strF"
[2,] "strA" "strB" "strC"

combn-like task for combinations as elements in R

Answers (2)

Related Questions