Reputation: 399
I have a table (might end up using SQL Server) that looks like this:
I would like to put products into a final grouping based on same criteria.
For example prod-A, prod-B, prod-X all belong to same group as their criteria rows are same (can't be partial match).
Hence I would like the output to be something like this
Any help will be appreciated. Thanks!
Upvotes: 0
Views: 1274
Reputation: 4345
This one is possibly the most inefficient query ever written BUT it does do the job:
SELECT t.product, t.criteria, final.final_grouping
FROM t
INNER JOIN
(SELECT p.product, groups.final_grouping
FROM
(SELECT DISTINCT product, STUFF(
(SELECT DISTINCT ',' + criteria
FROM t sub
WHERE t.product = sub.product
FOR XML PATH ('')
)
, 1, 1, '') AS criterias
FROM t
GROUP BY product) p
INNER JOIN
(SELECT ROW_NUMBER() OVER(ORDER BY criterias) AS final_grouping, c.criterias
FROM
(SELECT DISTINCT STUFF(
(SELECT DISTINCT ',' + criteria
FROM t sub
WHERE t.product = sub.product
FOR XML PATH ('')
)
, 1, 1, '') AS criterias
FROM t) c) groups
ON p.criterias = groups.criterias) final
ON t.product = final.product
You can validate here: http://rextester.com/SZUECY8704
Upvotes: 0
Reputation: 1269873
In SQL Server, this is trickier. You can use similar logic as in Postgres, but the hack for using XML is yuckier.
Instead, let's start by getting all pairs of exact matches:
with i as (
select i.*, count(*) over (partition by product) as nump
from input
)
select i1.product, i2.product
from i i1 join
i i2
on i1.criteria = i2.criteria and
i1.nump = i2.nump
group by i1.product, i2.product
having count(*) = i1.nump; -- everything matches
From here, we can aggregate to get the ranking:
with i as (
select i.*, count(*) over (partition by product) as nump
from input
)
select i1.product, min(i2.product) as grp_product,
dense_rank() over (order by min(i2.product)) as grp
from (select i1.product, i2.product
from i i1 join
i i2
on i1.criteria = i2.criteria and
i1.nump = i2.nump
group by i1.product, i2.product
having count(*) = i1.nump -- everything matches
) ii
group by i1.product;
For each product, this now assigns the group number.
You can use a join
to assign the value to each row.
Although you can use the same technique in Postgres, I think array aggregation is a simpler approach.
Upvotes: 2