Chris
Chris

Reputation: 117

Determining items that join against the same set in T-SQL

I am writing a a report and was wondering if there is there any way to find items that join into identical sets?

For example in the following query, I want to find all areas that join against the same set of products:

SELECT Area.Name, AggregateSetOfProductsId
FROM Area
INNER JOIN AreaToProduct ON AreaToProduct.AreaId = Area.Id
GROUP BY Area.Id

The current best solution that I could think of was writing a custom aggregate that hashed the ProductId uniqueidentifiers together (something like shift right based on some byte then xor them together) it but I figured I'd ask if there was an easier way before going through it.

Upvotes: 0

Views: 177

Answers (3)

Frans Bouma
Frans Bouma

Reputation: 8357

You can do this with set operations. Here I show you how to obtain all orderid's from northwind which contain exactly the same products (which is what you're after, right? the areas with exactly the same products)

select distinct o1.orderid 
from orders o1 inner join orders o2 on o1.orderid!=o2.orderid
where not exists
    (
        select * from [order details] od2
        where od2.orderId=o2.Orderid
            and od2.ProductId NOT IN 
            (
                select productid from [order details] od1
                where od1.OrderId=o1.OrderId
            )
    )
    and not exists
    (
        select * from [order details] od1
        where od1.orderId=o1.Orderid
            and od1.ProductId NOT IN 
            (
                select productid from [order details] od2
                where od2.OrderId=o2.OrderId
            )
    )

The idea is basicly this: return order o1 for which an order o2 exists for which there aren't any products not in the list of products of o1 and where there aren't any products from o1 not in the list of products of o2.

Upvotes: 1

Aaron Alton
Aaron Alton

Reputation: 23226

You can easily do this using a Common Table Expression. You would employ FOR XML PATH to create an array, and then use that array to identify matching products. Like this:

;WITH AreaProductSets
AS
(
SELECT a.AreaID, a.Name, 
                SUBSTRING((SELECT (', ' + CAST(ap.ProductID AS varchar(20)))
                                    FROM AreaToProduct ap
                                    WHERE ap.AreaID = a.AreaID
                                    ORDER BY ap.ProductID
                                    FOR XML PATH ('')), 3, 2000) AS ProductArray
FROM        Area AS a
)
SELECT  SUBSTRING((SELECT   (', ' + CAST(aps2.Name AS varchar(20)))
                                    FROM        AreaProductSets aps2
                                    WHERE   aps2.ProductArray = aps.ProductArray
                                    ORDER BY aps2.AreaID
                                    FOR XML PATH('')), 3, 2000) AS MatchingAreas,
                aps.ProductArray
FROM        (SELECT DISTINCT ProductArray FROM AreaProductSets) AS aps

Upvotes: 1

Remus Rusanu
Remus Rusanu

Reputation: 294307

Perhaps use the EXCEPT operator? If the EXCEPT between the two sets is empty, the sets are identical. See http://msdn.microsoft.com/en-us/library/ms188055.aspx Also, to hash a resultset you can use built in aggregates like CHECKSUM_AGG(BINARY_CHECKSUM(*)), see http://msdn.microsoft.com/en-us/library/ms188920.aspx

Upvotes: 0

Related Questions