Reputation: 21862
Building off of the question "SQL - Turn relationship IDs into a delimited list", I'd like to further group the results by unique sets of sites. For example, using the same data in that question:
Sites 15 & 16 together share one permutation of a,b,c. Sites 18, 19, 20 share another permutation. Site 17 uses BOTH permutations.
I'd like to find a query with results similar to the linked question, but without any overlap in the site_ids:
I'm not sure that it's possible with just SQL, and I may just have to build something to crunch the data in code, but I'm crossing my fingers...
I'm using SQL Server, but it would also be nice to know how to do it with postgres.
UPDATE:
Someone suggested that Simulating group_concat MySQL function in Microsoft SQL Server 2005? might be a duplicate. From my (limited) understanding of group_concat, it appears that it might duplicate the question referenced above, but not this question. This question needs no overlap in site_ids.
Upvotes: 0
Views: 136
Reputation: 16377
If you're okay with an array (which sort of manifests itself as a comma delimited string the way you have it), here is a one possible approach in PostgreSQL.
with perms as (
select
f.a, f.b, f.c, f.site_id,
count(1) over (partition by f.site_id) as cnt
from
your_table f
)
select
p1.a, p1.b, p1.c,
array (select p2.site_id
from perms p2
where
p1.a = p2.a and
p1.b = p2.b and
p1.c = p2.c and
p1.cnt = p2.cnt)
from perms p1
group by
p1.a, p1.b, p1.c, p1.cnt
The results would look like this:
bar baz blah {20,18,19}
foo bar baz {17}
foo bar baz {16,15}
bar baz blah {17}
Upvotes: 0
Reputation: 47444
I don't know if this is the most efficient way to do it, but it sure was fun coming up with it. :)
;WITH CTE_unique_groups AS (SELECT DISTINCT a, b, c FROM dbo.My_Table),
CTE_numbered_groups AS (SELECT a, b, c, ROW_NUMBER() OVER(ORDER BY a, b, c) AS row_num FROM CTE_unique_groups),
CTE_exponented_groups AS (SELECT a, b, c, POWER(2, row_num) AS group_value FROM CTE_numbered_groups),
CTE_unique_sets AS (
SELECT
MT.site_id,
SUM(grps.group_value) AS total_group_value
FROM
CTE_exponented_groups AS grps
INNER JOIN dbo.My_Table MT ON MT.a = grps.a AND MT.b = grps.b AND MT.c = grps.c
GROUP BY
MT.site_id
),
CTE_grouped_sites AS
(SELECT total_group_value, STUFF((SELECT ', ' + CAST(site_id AS VARCHAR(10)) FROM CTE_unique_sets t2 WHERE t2.total_group_value = t1.total_group_value FOR XML PATH('')), 1, 2, '') AS site_ids
FROM CTE_unique_sets t1)
SELECT DISTINCT
x.a, x.b, x.c, gs.site_ids
FROM
CTE_grouped_sites gs
INNER JOIN CTE_exponented_groups x ON
gs.total_group_value & x.group_value = x.group_value
The gist of it is that first you have to identify every unique set of groups. Once you've done that, you can then use the bitwise &
operator to explode the site IDs back into the groups to which they belonged.
Upvotes: 3