Tom el Safadi
Tom el Safadi

Reputation: 6746

T-SQL Return All Combinations of a Table

In a different question, I asked about all possible 3-way combinations of a table, assuming that there are 3 articles. In this question, I would like to further extend the problem to return all n-way combinations of a table with n distinct articles. One article can have multiple suppliers. My goal is to have groups of combinations with each group having each article.

Below is a sample table but keep in mind that there could be more than those 3 articles.

+---------+----------+
| Article | Supplier |
+---------+----------+
|    4711 | A        |
|    4712 | B        |
|    4712 | C        |
|    4712 | D        |
|    4713 | C        |
|    4713 | E        |
+---------+----------+

For the example above, there would be 6 combination pairs possible with 18 datasets. Below is how the solution for the example above is supposed to look like:

+----------------+---------+----------+
| combination_nr | article | supplier |
+----------------+---------+----------+
|              1 |    4711 | A        |
|              1 |    4712 | B        |
|              1 |    4713 | C        |
|              2 |    4711 | A        |
|              2 |    4712 | B        |
|              2 |    4713 | E        |
|              3 |    4711 | A        |
|              3 |    4712 | C        |
|              3 |    4713 | C        |
|              4 |    4711 | A        |
|              4 |    4712 | D        |
|              4 |    4713 | E        |
|              5 |    4711 | A        |
|              5 |    4712 | D        |
|              5 |    4713 | C        |
|              6 |    4711 | A        |
|              6 |    4712 | D        |
|              6 |    4713 | E        |
+----------------+---------+----------+

I am asking this in a different question because in the other question I have not specified that I need this to be dynamic and not only the 3 articles from above. Here you can find the old question.

Upvotes: 1

Views: 671

Answers (2)

Daniel Gimenez
Daniel Gimenez

Reputation: 20494

To create this result you'll need to determine two things:

  • A set of data with ids for the the number of possible combinations.
  • A way to determine when an article/supplier pair corresponds to a combination id.

In order to figure out the set of combinations, you need to first figure out the number of combinations. To do that, you need to figure out the size of each supplier set for each article. Getting the size is accomplished easily with a count aggregate function, but in order to figure out the combinations, I needed a product of all the values, which is not easily done. Fortunately there was an answer on how to do this in another SO questions.

Now that the number of combinations is determined, the ids have to be generated. There is no great way to do this in TSQL. I ended up using a simple recursive CTE. This has the drawback of only being able to produce up to 32767 combos. There are other methods to produce these values if you're going to need more.

To determine when an article/supplier pair lines up with a combination I use ROW_NUMBER window function partition by Article and sorted by Supplier to get a sequence number for each pair. Then it's using an old trick of using Modulo of the combination number by the sequence number to determine if a pair is displayed.

There is still an issue in that there is no guarantee that redundant pairs won't be paired together. In order to address this a CTE was added that calculates the number of possible combinations that come before an article. The intention is so that a value in a later article is repeated the number of times for a combination before the next in sequence is displayed. I called it multiplier (even though I divide the comboId by it) and this is what will ensure distinct results.

WITH ComboId AS (
    -- Product from https://stackoverflow.com/questions/3912204/why-is-there-no-product-aggregate-function-in-sql
    SELECT 0 [ComboId], exp (sum (log (SequenceCount))) MaxCombos
    FROM (
        SELECT COUNT(*) SequenceCount
        FROM src 
        GROUP BY Article
    ) x
    UNION ALL
    SELECT ComboId + 1, MaxCombos
    FROM ComboId
    WHERE ComboId + 1 < MaxCombos
)
, Multiplier AS (
    SELECT s.Article
        , COALESCE(exp (sum (log (SequenceCount)) OVER (ORDER BY Article ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)), 1) [Multiplier]
    FROM (
        SELECT Article, COUNT(*) SequenceCount
        FROM src
        GROUP BY Article
    ) s
)
, Sequenced AS (
    SELECT s.Article, s.Supplier, m.Multiplier
        , ROW_NUMBER() OVER (PARTITION BY s.Article ORDER BY s.Supplier) - 1 ArtSupplierSeqNum
        , COUNT(*) OVER (PARTITION BY s.Article) MaxArtSupplierSeqNum
    FROM src s
    INNER JOIN Multiplier m ON m.Article = s.Article
)
SELECT c.ComboId + 1 [ComboId], s.Article, s.Supplier
FROM ComboId c
INNER JOIN Sequenced s ON s.ArtSupplierSeqNum = CAST(c.ComboId / Multiplier as INT) % s.MaxArtSupplierSeqNum
ORDER BY ComboId, Article
OPTION (MAXRECURSION 32767)

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269503

If you want to generate all combinations of articles, then you can use a recursive CTE. However, the trick is to store all the ids at each level -- but that requires using a string (of some sort).

So, for the articles, you can do:

with a as (
      select distinct article from t
     ),
     cte as (
      select a.article, 1 as lev, convert(varchar(max), article) as grp
      from a
      union all
      select a.article, lev + 1, concat(grp, ':', a.article)
      from cte join
           a
           on a.article > cte.article
     )
select cte.combination_number, s.value
from (select cte.*, row_number() over (order by lev, grp) as combination_number
      from cte
     ) cte cross apply
     string_split(grp, ':') s
order by cte.combination_number, s.value;

You also seem to want one of the suppliers as well. You can arbitrarily choose a supplier -- but I'm not sure what rules you might want for that.

Here is a db<>fiddle.

EDIT:

If you want the article supplier pairs considered as a unit -- but different articles in each combination -- you can tweak the above query:

with a as (
      select distinct article from t
     ),
     cte as (
      select a.article, 1 as lev, convert(varchar(max), article) as grp
      from a
      union all
      select a.article, lev + 1, concat(grp, ':', a.article)
      from cte join
           a
           on a.article > cte.article
     )
select cte.combination_number, s.value
from (select cte.*, row_number() over (order by lev, grp) as combination_number
      from cte
     ) cte cross apply
     string_split(grp, ':') s
order by cte.combination_number, s.value;

The db<>fiddle has this solution as well.

Upvotes: 3

Related Questions