Reputation: 117
I am writing a a report and was wondering if there is there any way to find items that join into identical sets?
For example in the following query, I want to find all areas that join against the same set of products:
SELECT Area.Name, AggregateSetOfProductsId
FROM Area
INNER JOIN AreaToProduct ON AreaToProduct.AreaId = Area.Id
GROUP BY Area.Id
The current best solution that I could think of was writing a custom aggregate that hashed the ProductId uniqueidentifiers together (something like shift right based on some byte then xor them together) it but I figured I'd ask if there was an easier way before going through it.
Upvotes: 0
Views: 177
Reputation: 8357
You can do this with set operations. Here I show you how to obtain all orderid's from northwind which contain exactly the same products (which is what you're after, right? the areas with exactly the same products)
select distinct o1.orderid
from orders o1 inner join orders o2 on o1.orderid!=o2.orderid
where not exists
(
select * from [order details] od2
where od2.orderId=o2.Orderid
and od2.ProductId NOT IN
(
select productid from [order details] od1
where od1.OrderId=o1.OrderId
)
)
and not exists
(
select * from [order details] od1
where od1.orderId=o1.Orderid
and od1.ProductId NOT IN
(
select productid from [order details] od2
where od2.OrderId=o2.OrderId
)
)
The idea is basicly this: return order o1 for which an order o2 exists for which there aren't any products not in the list of products of o1 and where there aren't any products from o1 not in the list of products of o2.
Upvotes: 1
Reputation: 23226
You can easily do this using a Common Table Expression. You would employ FOR XML PATH to create an array, and then use that array to identify matching products. Like this:
;WITH AreaProductSets
AS
(
SELECT a.AreaID, a.Name,
SUBSTRING((SELECT (', ' + CAST(ap.ProductID AS varchar(20)))
FROM AreaToProduct ap
WHERE ap.AreaID = a.AreaID
ORDER BY ap.ProductID
FOR XML PATH ('')), 3, 2000) AS ProductArray
FROM Area AS a
)
SELECT SUBSTRING((SELECT (', ' + CAST(aps2.Name AS varchar(20)))
FROM AreaProductSets aps2
WHERE aps2.ProductArray = aps.ProductArray
ORDER BY aps2.AreaID
FOR XML PATH('')), 3, 2000) AS MatchingAreas,
aps.ProductArray
FROM (SELECT DISTINCT ProductArray FROM AreaProductSets) AS aps
Upvotes: 1
Reputation: 294307
Perhaps use the EXCEPT operator? If the EXCEPT between the two sets is empty, the sets are identical. See http://msdn.microsoft.com/en-us/library/ms188055.aspx Also, to hash a resultset you can use built in aggregates like CHECKSUM_AGG(BINARY_CHECKSUM(*)), see http://msdn.microsoft.com/en-us/library/ms188920.aspx
Upvotes: 0