Dave
Dave

Reputation: 474

How to create all possible pair combinations without duplicates in Google Sheets?

How to perform iteration over excel/google sheets cells to get pairwise combinations?

"string1"

"string2"
"string3"
...
"string10"

I'm looking at writing a function that can iterate over these strings to create the following:

"string1, string2" 
"string1, string 3" 
...
"string 1, string 10" 
"string 2, string 3" 
...
"string 2, string 10" 
"string3, string 4" 
... ... 
"string9 string10".

Is this possible in google sheets?

Upvotes: 1

Views: 4356

Answers (7)

player0
player0

Reputation: 1

=ARRAYFORMULA(SPLIT(SORT(
 TRANSPOSE(SPLIT(CONCATENATE(REPT(UNIQUE(TRANSPOSE(SPLIT(JOIN(",",TEXTJOIN(",",1,A2:A)),",")))&","&
 TRANSPOSE(UNIQUE(TRANSPOSE(SPLIT(JOIN(",",TEXTJOIN(",",1,A2:A)),",")))), (
 UNIQUE(TRANSPOSE(SPLIT(JOIN(",",TEXTJOIN(",",1,A2:A)),",")))<=
 TRANSPOSE(UNIQUE(TRANSPOSE(SPLIT(JOIN(",",TEXTJOIN(",",1,A2:A)),",")))))*
 REGEXMATCH(CONCATENATE(","&SUBSTITUTE(TEXTJOIN(",",1,A2:A),",",",,")&","&CHAR(9)),"(,"&
 UNIQUE(TRANSPOSE(SPLIT(JOIN(",",TEXTJOIN(",",1,A2:A)),",")))&",[^\t]*,"&
 TRANSPOSE(UNIQUE(TRANSPOSE(SPLIT(JOIN(",",TEXTJOIN(",",1,A2:A)),","))))&",)|(,"&
 TRANSPOSE(UNIQUE(TRANSPOSE(SPLIT(JOIN(",",TEXTJOIN(",",1,A2:A)),","))))&",[^\t]*,"&
 UNIQUE(TRANSPOSE(SPLIT(JOIN(",",TEXTJOIN(",",1,A2:A)),",")))&",)"))&CHAR(9)),CHAR(9)))),","))

0

for more see: https://stackoverflow.com/a/68775825/5632629

Upvotes: 2

player0
player0

Reputation: 1

=QUERY(ARRAYFORMULA(SPLIT(
 TRANSPOSE(SPLIT(REPT(CONCATENATE(A2:A&CHAR(9)), COUNTA(A2:A)),  CHAR(9)))& " "&
 TRANSPOSE(SPLIT(CONCATENATE(REPT(A2:A&CHAR(9),  COUNTA(A2:A))), CHAR(9))), " ")),
 "where Col1<>Col2 order by Col1", 0)

for more see: https://stackoverflow.com/a/68775825/5632629

Upvotes: 1

stwheel1
stwheel1

Reputation: 1

With the items to combine in named range "items":

=let(x,tocol(let(i,filter(items,items<>""),t,transpose(filter(items,items<>"")),arrayformula(if(match(i,items,0)>=match(t,items,0),,i&t)))),filter(x,x<>""))

Upvotes: 0

doubleunary
doubleunary

Reputation: 18698

The question seems to describe a special case of a Cartesian product of two sets where the sets are identical while removing results that have the same value in both tuple values.

How about a humble three-step solution:

all combos:

=arrayformula( split( 
  tocol(A2:A5 & "→" & transpose(A2:A5)), 
  "→" 
) )

filter out rows where the two strings are the same:

=filter(B2:C, B2:B <> C2:C)

join with comma:

=arrayformula( 
  if( 
    len(D2:D), 
    D2:D & ", " & E2:E, 
    iferror(ø) 
  ) 
)
source data all combos filter join Cartesian solution
string1 string1 string1 string1 string2 string1, string2 string1, string2
string2 string1 string2 string1 string3 string1, string3 string1, string3
string3 string1 string3 string1 string4 string1, string4 string1, string4
string4 string1 string4 string2 string1 string2, string1 string2, string1
string2 string1 string2 string3 string2, string3 string2, string3
string2 string2 string2 string4 string2, string4 string2, string4
string2 string3 string3 string1 string3, string1 string3, string1
string2 string4 string3 string2 string3, string2 string3, string2
string3 string1 string3 string4 string3, string4 string3, string4
string3 string2 string4 string1 string4, string1 string4, string1
string3 string3 string4 string2 string4, string2 string4, string2
string3 string4 string4 string3 string4, string3 string4, string3
string4 string1
string4 string2
string4 string3
string4 string4

The same can be done in one step with a recursive Cartesian that has an if() in the inner reduce():

Cartesian solution:

=let( 
  data, A2:A, 
  column, unique(filter(data, len(data))), 
  table, { column, column }, 
  blank, iferror(1/0), 
  first_, lambda(array, tocol(choosecols(array, 1), true)), 
  rest_, lambda(n, choosecols(table, sequence(1, columns(table) - n, n + 1))), 
  wrap_, lambda(array, wrapCount, wraprows(tocol(array, true), wrapCount)), 

  cartesian_, lambda(a, b, wrap_( 
    byrow(a, lambda(row, 
      reduce(blank, sequence(rows(b)), lambda(acc, i, 
        if( 
          row = chooserows(b, i), 
          acc, 
          { acc, row, chooserows(b, i) } 
        ) 
      ) ) 
    ) ), 
    columns(a) + columns(b) 
  ) ), 

  iterate_, lambda( 
    self, a, b, if(iserror(b), a, 
      self(self, cartesian_(a, first_(b)), rest_(columns(a) + 1)) 
    ) 
  ), 

  result, iterate_(iterate_, first_(table), rest_(1)), 
  byrow(result, lambda(row, join(", ", row))) 
)

Upvotes: 0

TheMaster
TheMaster

Reputation: 50382

Given the advantage of newer functions like REDUCE, it is possible to loop. This is basically a implementation of two loops mentioned here. The second one - one minus the first one.

=QUERY(LAMBDA(mrg,REDUCE({"Round Robin"&CHAR(10)&"(no order)",HYPERLINK("https://stackoverflow.com/a/73901481","-TheMaster")},SEQUENCE(ROWS(mrg)-1,1,0),LAMBDA(a_,c_,{a_;LAMBDA(rg,REDUCE({"",""},OFFSET(rg,1,0,ROWS(rg)-1),LAMBDA(a,c,{a;{INDEX(rg,1),c}})))(OFFSET(mrg,c_,0,ROWS(mrg)-c_))})))(A1:A5),"where Col1 is not null")

where A1:A5 is the list of 5 players.

Round Robin (no order/rotate) -TheMaster
Player 1 Player 2
Player 1 Player 3
Player 1 Player 4
Player 1 Player 5
Player 2 Player 3
Player 2 Player 4
Player 2 Player 5
Player 3 Player 4
Player 3 Player 5
Player 4 Player 5

Upvotes: 2

Tom Sharpe
Tom Sharpe

Reputation: 34180

I have to agree with @Max that it is difficult with native functions, or at least long-winded, but it is possible in Google Sheets

=ArrayFormula(query({if((row(A:A)<=counta(A:A)^2)*(int((row(A:A)-1)/counta(A:A))<mod((row(A:A)-1),counta(A:A))),
vlookup(int((row(A:A)-1)/counta(A:A)),{row(A:A)-1,A:A},2)&vlookup(mod((row(A:A)-1),counta(A:A)),{row(A:A)-1,A:A},2),"")},"select Col1 where Col1<>''"))

enter image description here

Note 1 - method

Using a list of 10 strings as an example.

(1) Add a column to number the strings from 0 to 9 using

{row(A:A)-1,A:A}

(2) Use the row number in a VLOOKUP to get the first string of the pair with

vlookup(int((row(A:A)-1)/counta(A:A)),{row(A:A)-1,A:A},2)

Row number-1 int((row(A:A)-1)/counta(A:A))  String

0            0                              String1

1            0                              String1

...

9            0                              String1

10           1                              String2

...

20           2                              String3

...

99           9                              String10

(3) Use the row number in a VLOOKUP to get the second string of the pair with

vlookup(mod((row(A:A)-1),counta(A:A)),{row(A:A)-1,A:A},2)

Row number-1  mod((row(A:A)-1),counta(A:A)) String

0             0                             String1

1             1                             String2

2             2                             String3

...

9             9                             String10

10            0                             String1

11            1                             String2

...

99            9                             String10

Note that the list will include unwanted pairs like String1String1 and String2String1.

(4) Set unwanted pairs to "" with if condition

if((row(A:A)<=counta(A:A)^2)*(int((row(A:A)-1)/counta(A:A))

Note 1 Using a filter to remove unwanted pairs as suggested by @Max Makhrov would be shorter.

(5) Use Query to remove blank rows.

Note 2 - limitation on number of rows

Because redundant pairs are generated then removed, this method requires N^2 rows to be in the sheet where N is the number of strings rather than N*(N-1)/2 which is the number of distinct pairs of N objects. Conversely, the maximum number of strings s which can be processed this way for a sheet with N rows is floor(sqrt(N)), e.g. for a sheet with 1,000 rows s=floor(sqrt(1000))=31.

Note 3 - a possible way to avoid generating redundant pairs

One way of visualising what I have tried to do is as follows, where the array elements represent output rows (A:A) and the row and column headers indicate corresponding values which are used as lookups to get pairs like (string 1, string 1), (string 1 string 2) etc.

enter image description here

It is fairly easy to do the mapping from output rows to lookup values using integer division and the MOD function as above.

What we would really like to do is to get non-redundant pairs like this

enter image description here

but then how would you map from output rows 1-10 to pairs of lookup values 1-5 ?

I hope to show that this is possible with a bit of maths providing (at least in principle) a way to get the N(N-1)/2 non-redundant pairs straight away without first generating all N^2 pairs.

The count S of cells in rows 1 to r of the upper triangular part above is the total count N(N-1)/2 minus the count in the rows below it (N-r)(N-r-1)/2

enter image description here

This can be re-arranged as follows

enter image description here

This is a quadratic in r so we can solve it using the regular formula

enter image description here

to give

enter image description here

So the row is given by the ceiling of the above formula for r.

The number (say T) at the end of row r is given by substituting the ceiling of r back into the second equation above

enter image description here

and finally the column corresponding to S is given by

enter image description here

Now define a named range N whose value is

=counta(A:A)

and a named range M whose value is

=2*N-1

Then finally the formula you need to select stringA (the row r of the matrix) is

=iferror(ArrayFormula(vlookup(ceiling((M-sqrt(M^2-8*row(A:A)))/2,1),{row(A:A),A:A},2)),"")

and the formula you need to select stringB (the column c of the matrix) is

=iferror(ArrayFormula(vlookup(N+row(A:A)-(M*CEILING((M-SQRT(M^2-8*row(A:A)))/2,1)-CEILING((M-SQRT(M^2-8*row(A:A)))/2,1)^2)/2,{row(A:A),A:A},2)),"")

enter image description here

where columns D and E are just included for testing purposes.

Then it only remains to combine the two formulas into one column if desired.

Upvotes: 2

Max Makhrov
Max Makhrov

Reputation: 18707

It is a hard task for native functions. Try a script and use it as a custom function:

function getTournament(teams_from_range)

    {
      // teams_from_range -- 2D Array  
      var teams = [];
      // convert to list
      teams_from_range.forEach(function(row) { row.forEach(function(cell) { teams.push(cell); } ); } );
      return getTournament_(teams);
    }
    
    
    function getTournament_(teams)
    {
      var start = 0;
      var l = teams.length;
      var result = [], game = [];
      
      // loop each value
      for (var i = 0; i < l; i++)
      {
        // loop each value minus current
        start++;
        for (var ii = start; ii < l; ii++)
        {
          game = []
          game.push(teams[i]);
          game.push(teams[ii]);  
          result.push(game);
        }  
      }
      
      return result;
    
    }

Usage:

=getTournament(A1:A10)

Upvotes: 2

Related Questions