Reputation: 91
I've been struggling to work this out all afternoon - it seems pretty simple but I must be missing something!
I've got a query which returns some data, two of the columns it returns are "PackageWeight" and "PackageGroup". Essentially, I want to filter this data down to show only one row for each "PackageGroup" - that should be the row with the highest value in the "PackageWeight" column.
It seems simple but I just can't get it to work in SQL Server using a combination of TOP 1 and GROUP BY. I must be missing something!
SELECT VendorID, PackageID, PackageWeight, PackageGroup
FROM (SELECT VendorID, COUNT(*) AS qty
FROM VendorServices
GROUP BY VendorID
) cs
JOIN (SELECT PackageServices.PackageID, lookupPackages.PackageWeight, lookupPackages.PackageGroup, COUNT(*) AS qty
FROM PackageServices
JOIN lookupPackages ON PackageServices.PackageID = lookupPackages.PackageID
GROUP BY PackageServices.PackageID, lookupPackages.PackageWeight, lookupPackages.PackageGroup
) ps ON cs.qty >= ps.qty
WHERE (SELECT COUNT(*)
FROM VendorServices cs2
JOIN PackageServices ps2 ON cs2.ServiceTypeID = ps2.ServiceID
WHERE cs2.VendorID = cs.VendorID
AND ps2.PackageID = ps.PackageID
) = ps.qty
This query returns me the complete dataset, that I need to filter down. However my attempts so far have failed :(
Any help much appreciated!
EDIT - Thanks to contributors below, so far I have the following query:
with result_cte as
(
SELECT VendorID, PackageID, PackageWeight, PackageGroup,
RANK() over (partition by PackageGroup order by PackageWeight desc) as [rank]
FROM (SELECT VendorID, COUNT(*) AS qty
FROM VendorServices
GROUP BY VendorID
) cs
JOIN (SELECT PackageServices.PackageID, lookupPackages.PackageWeight, lookupPackages.PackageGroup, COUNT(*) AS qty
FROM PackageServices
JOIN lookupPackages ON PackageServices.PackageID = lookupPackages.PackageID
GROUP BY PackageServices.PackageID, lookupPackages.PackageWeight, lookupPackages.PackageGroup
) ps ON cs.qty >= ps.qty
WHERE (SELECT COUNT(*)
FROM VendorServices cs2
JOIN PackageServices ps2 ON cs2.ServiceTypeID = ps2.ServiceID
WHERE cs2.VendorID = cs.VendorID
AND ps2.PackageID = ps.PackageID
) = ps.qty
)
select *
from result_cte
WHERE [rank] = 1
ORDER BY VendorID
So far, so good. I'll still take a look at the APPLY operator suggested by @gbn as this is new to me - and I still need to do some testing to ensure that this query works 100% of the time. However initial indications are good!
Thanks to all who have contributed so far.
EDIT 2 - Sadly, after populating the database with more example data, this query failed to work. It seems to miss out some entries.
Perhaps I need to explain a little more about what is going on here. The data being returned by my original query lists every customer in the system, along with the derived PackageID (calculated by that query) and the weight and group assigned to that Package in a lookup table.
I need to filter the original results table so that I get no more than one package from each group, for each customer (each customer may have a package from one or more group but might not have a package from every group)
I'll take a fresher look at this tomorrow as I think I might be in a 'Can't see the wood for the trees' situation!
Thanks all.
Upvotes: 2
Views: 24000
Reputation: 1784
Can you try this? It's not bulletproof if you have multiple records with the same weight in the same group. There are other ways to handle it.
with result_cte as
(
SELECT VendorID, PackageID, PackageWeight, PackageGroup
FROM (SELECT VendorID, COUNT(*) AS qty
FROM VendorServices
GROUP BY VendorID
) cs
JOIN (SELECT PackageServices.PackageID, lookupPackages.PackageWeight, lookupPackages.PackageGroup, COUNT(*) AS qty
FROM PackageServices
JOIN lookupPackages ON PackageServices.PackageID = lookupPackages.PackageID
GROUP BY PackageServices.PackageID, lookupPackages.PackageWeight, lookupPackages.PackageGroup
) ps ON cs.qty >= ps.qty
WHERE (SELECT COUNT(*)
FROM VendorServices cs2
JOIN PackageServices ps2 ON cs2.ServiceTypeID = ps2.ServiceID
WHERE cs2.VendorID = cs.VendorID
AND ps2.PackageID = ps.PackageID
) = ps.qty
)
select *
from result_cte
where result_cte.PackageWeight = (select top 1 highestweight.PackageWeight from result_cte highestweight
where highestweight.PackageGroup = result_cte.PackageGroup
order by highestweight.PackageWeight desc)
Or you can do this:
with result_cte as
(
SELECT VendorID, PackageID, PackageWeight, PackageGroup,
ROW_NUMBER() over (partition by PackageGroup order by PackageWeight desc) as [row]
FROM (SELECT VendorID, COUNT(*) AS qty
FROM VendorServices
GROUP BY VendorID
) cs
JOIN (SELECT PackageServices.PackageID, lookupPackages.PackageWeight, lookupPackages.PackageGroup, COUNT(*) AS qty
FROM PackageServices
JOIN lookupPackages ON PackageServices.PackageID = lookupPackages.PackageID
GROUP BY PackageServices.PackageID, lookupPackages.PackageWeight, lookupPackages.PackageGroup
) ps ON cs.qty >= ps.qty
WHERE (SELECT COUNT(*)
FROM VendorServices cs2
JOIN PackageServices ps2 ON cs2.ServiceTypeID = ps2.ServiceID
WHERE cs2.VendorID = cs.VendorID
AND ps2.PackageID = ps.PackageID
) = ps.qty
)
select *
from result_cte
where [row] = 1
Upvotes: 3
Reputation: 91
Thanks to the post by Eric.K.Yung - I finally solved this using his query but adding VendorID (effectively CustomerID) to the 'partition by' part of the query. This ensured that packages were returned for all customers.
Thanks to all who contributed. The final query is:
with result_cte as
(
SELECT VendorID, PackageID, PackageWeight, PackageGroup,
ROW_NUMBER() over (partition by PackageGroup, VendorID order by PackageWeight desc) as [row]
FROM (SELECT VendorID, COUNT(*) AS qty
FROM VendorServices
GROUP BY VendorID
) cs
JOIN (SELECT PackageServices.PackageID, lookupPackages.PackageWeight, lookupPackages.PackageGroup, COUNT(*) AS qty
FROM PackageServices
JOIN lookupPackages ON PackageServices.PackageID = lookupPackages.PackageID
GROUP BY PackageServices.PackageID, lookupPackages.PackageWeight, lookupPackages.PackageGroup
) ps ON cs.qty >= ps.qty
WHERE (SELECT COUNT(*)
FROM VendorServices cs2
JOIN PackageServices ps2 ON cs2.ServiceTypeID = ps2.ServiceID
WHERE cs2.VendorID = cs.VendorID
AND ps2.PackageID = ps.PackageID
) = ps.qty
)
select *
from result_cte
where [row] = 1
Upvotes: 0
Reputation: 137
You could use the MAX function:
SELECT * FROM #one
lbs groups
5 0
4 0
1 0
9 1
2 1
SELECT groups,MAX(lbs)
FROM #one
GROUP BY groups
groups (No column name)
0 5
1 9
Upvotes: 1
Reputation: 133
Are you are willing to accept a single arbitrary vendor and PackageID
if multiple packages have the same max weight in a group? If OK, just put an aggregate on them as well as the PackageWeight
:
SELECT max(VendorID), max(PackageID), max(PackageWeight), PackageGroup
...
GROUP BY PackageGroup
Otherwise, you will need to do as E.Y. suggests and do the nested query to first find the max weight per group and process the duplicates, if any, yourself.
Upvotes: 0