JimmE
JimmE

Reputation: 91

Filtering of results returned by SQL query

I've been struggling to work this out all afternoon - it seems pretty simple but I must be missing something!

I've got a query which returns some data, two of the columns it returns are "PackageWeight" and "PackageGroup". Essentially, I want to filter this data down to show only one row for each "PackageGroup" - that should be the row with the highest value in the "PackageWeight" column.

It seems simple but I just can't get it to work in SQL Server using a combination of TOP 1 and GROUP BY. I must be missing something!

    SELECT VendorID, PackageID, PackageWeight, PackageGroup
  FROM (SELECT VendorID, COUNT(*) AS qty
          FROM VendorServices
         GROUP BY VendorID
       ) cs
  JOIN (SELECT PackageServices.PackageID, lookupPackages.PackageWeight, lookupPackages.PackageGroup, COUNT(*) AS qty
          FROM PackageServices
          JOIN lookupPackages ON PackageServices.PackageID = lookupPackages.PackageID
          GROUP BY PackageServices.PackageID, lookupPackages.PackageWeight, lookupPackages.PackageGroup
       ) ps ON cs.qty >= ps.qty
  WHERE (SELECT COUNT(*)
          FROM VendorServices cs2
          JOIN PackageServices ps2 ON cs2.ServiceTypeID = ps2.ServiceID
         WHERE cs2.VendorID = cs.VendorID
           AND ps2.PackageID = ps.PackageID
       ) = ps.qty

This query returns me the complete dataset, that I need to filter down. However my attempts so far have failed :(

Any help much appreciated!

EDIT - Thanks to contributors below, so far I have the following query:

with result_cte as
(
SELECT VendorID, PackageID, PackageWeight, PackageGroup,
    RANK() over (partition by PackageGroup order by PackageWeight desc) as [rank]
FROM (SELECT VendorID, COUNT(*) AS qty
    FROM VendorServices
    GROUP BY VendorID
    ) cs
JOIN (SELECT PackageServices.PackageID, lookupPackages.PackageWeight, lookupPackages.PackageGroup, COUNT(*) AS qty
    FROM PackageServices
    JOIN lookupPackages ON PackageServices.PackageID = lookupPackages.PackageID
    GROUP BY PackageServices.PackageID, lookupPackages.PackageWeight, lookupPackages.PackageGroup
    ) ps ON cs.qty >= ps.qty
WHERE (SELECT COUNT(*)
    FROM VendorServices cs2
    JOIN PackageServices ps2 ON cs2.ServiceTypeID = ps2.ServiceID
    WHERE cs2.VendorID = cs.VendorID
    AND ps2.PackageID = ps.PackageID
    ) = ps.qty
)

select *
from result_cte
WHERE [rank] = 1
ORDER BY VendorID

So far, so good. I'll still take a look at the APPLY operator suggested by @gbn as this is new to me - and I still need to do some testing to ensure that this query works 100% of the time. However initial indications are good!

Thanks to all who have contributed so far.

EDIT 2 - Sadly, after populating the database with more example data, this query failed to work. It seems to miss out some entries.

Perhaps I need to explain a little more about what is going on here. The data being returned by my original query lists every customer in the system, along with the derived PackageID (calculated by that query) and the weight and group assigned to that Package in a lookup table.

I need to filter the original results table so that I get no more than one package from each group, for each customer (each customer may have a package from one or more group but might not have a package from every group)

I'll take a fresher look at this tomorrow as I think I might be in a 'Can't see the wood for the trees' situation!

Thanks all.

Upvotes: 2

Views: 24000

Answers (4)

Eric K Yung
Eric K Yung

Reputation: 1784

Can you try this? It's not bulletproof if you have multiple records with the same weight in the same group. There are other ways to handle it.

with result_cte as
(
SELECT VendorID, PackageID, PackageWeight, PackageGroup
FROM (SELECT VendorID, COUNT(*) AS qty
    FROM VendorServices
    GROUP BY VendorID
    ) cs
JOIN (SELECT PackageServices.PackageID, lookupPackages.PackageWeight, lookupPackages.PackageGroup, COUNT(*) AS qty
    FROM PackageServices
    JOIN lookupPackages ON PackageServices.PackageID = lookupPackages.PackageID
    GROUP BY PackageServices.PackageID, lookupPackages.PackageWeight, lookupPackages.PackageGroup
    ) ps ON cs.qty >= ps.qty
WHERE (SELECT COUNT(*)
    FROM VendorServices cs2
    JOIN PackageServices ps2 ON cs2.ServiceTypeID = ps2.ServiceID
    WHERE cs2.VendorID = cs.VendorID
    AND ps2.PackageID = ps.PackageID
    ) = ps.qty
)

select *
from result_cte
where result_cte.PackageWeight = (select top 1 highestweight.PackageWeight from result_cte highestweight
                                where highestweight.PackageGroup = result_cte.PackageGroup
                                order by highestweight.PackageWeight desc)

Or you can do this:

with result_cte as
(
SELECT VendorID, PackageID, PackageWeight, PackageGroup,
    ROW_NUMBER() over (partition by PackageGroup order by PackageWeight desc) as [row]
FROM (SELECT VendorID, COUNT(*) AS qty
    FROM VendorServices
    GROUP BY VendorID
    ) cs
JOIN (SELECT PackageServices.PackageID, lookupPackages.PackageWeight, lookupPackages.PackageGroup, COUNT(*) AS qty
    FROM PackageServices
    JOIN lookupPackages ON PackageServices.PackageID = lookupPackages.PackageID
    GROUP BY PackageServices.PackageID, lookupPackages.PackageWeight, lookupPackages.PackageGroup
    ) ps ON cs.qty >= ps.qty
WHERE (SELECT COUNT(*)
    FROM VendorServices cs2
    JOIN PackageServices ps2 ON cs2.ServiceTypeID = ps2.ServiceID
    WHERE cs2.VendorID = cs.VendorID
    AND ps2.PackageID = ps.PackageID
    ) = ps.qty
)

select *
from result_cte
where [row] = 1

Upvotes: 3

JimmE
JimmE

Reputation: 91

Thanks to the post by Eric.K.Yung - I finally solved this using his query but adding VendorID (effectively CustomerID) to the 'partition by' part of the query. This ensured that packages were returned for all customers.

Thanks to all who contributed. The final query is:

with result_cte as
(
SELECT VendorID, PackageID, PackageWeight, PackageGroup,
    ROW_NUMBER() over (partition by PackageGroup, VendorID order by PackageWeight desc) as [row]
FROM (SELECT VendorID, COUNT(*) AS qty
    FROM VendorServices
    GROUP BY VendorID
    ) cs
JOIN (SELECT PackageServices.PackageID, lookupPackages.PackageWeight, lookupPackages.PackageGroup, COUNT(*) AS qty
    FROM PackageServices
    JOIN lookupPackages ON PackageServices.PackageID = lookupPackages.PackageID
    GROUP BY PackageServices.PackageID, lookupPackages.PackageWeight, lookupPackages.PackageGroup
    ) ps ON cs.qty >= ps.qty
WHERE (SELECT COUNT(*)
    FROM VendorServices cs2
    JOIN PackageServices ps2 ON cs2.ServiceTypeID = ps2.ServiceID
    WHERE cs2.VendorID = cs.VendorID
    AND ps2.PackageID = ps.PackageID
    ) = ps.qty
)

select *
from result_cte
where [row] = 1

Upvotes: 0

tradotto
tradotto

Reputation: 137

You could use the MAX function:

SELECT * FROM #one
lbs groups
5   0
4   0
1   0
9   1
2   1     

SELECT groups,MAX(lbs)
FROM #one
GROUP BY groups

groups  (No column name)
0   5
1   9

Upvotes: 1

James
James

Reputation: 133

Are you are willing to accept a single arbitrary vendor and PackageID if multiple packages have the same max weight in a group? If OK, just put an aggregate on them as well as the PackageWeight:

SELECT max(VendorID), max(PackageID), max(PackageWeight), PackageGroup
...
GROUP BY PackageGroup

Otherwise, you will need to do as E.Y. suggests and do the nested query to first find the max weight per group and process the duplicates, if any, yourself.

Upvotes: 0

Related Questions