Reputation: 23099
I have a table like so called billing
sku_code | invoice_id | sku_amount | sku_sale
integer | integer | float | float
What I want do is first find the top 350 sku_code
by sku_amount
SELECT TOP 350 Sum(sku_amount) AS amt,
sku_code
FROM billing
GROUP BY sku_code
ORDER BY amt DESC
I then want to slice the entire table by the table by the 350 sku_code
that appear above
I assume this would be some sort of inner-join
via a sub-query
but I can't figure out the syntax.
The query I want to finally run is but with it only returning the 350 sku_codes from the above query.
SELECT sum(sku_amount) as amt,sku_code, invoice_id
from billing
group by sku_code, invoice_id
order by amt DESC
This is giving me a table that will look like which should have around 20-30 million rows.
amt | sku_code | invoice_id
Upvotes: 1
Views: 250
Reputation: 11556
You can achieve this by giving a row_number
for the rows within CTE
and use the rows having the row_number less than or equal to 350.
Query
;with cte as(
select [rn] = row_number() over(
partition by [sku_code]
order by [amount] desc
), *
from [bill]
)
select sum([sku_amount]) as [amount]
from cte
where [rn] <= 350;
Upvotes: 1
Reputation: 1270503
If I understand correctly, you want the original rows for the top 350 skus. A JOIN
should suffice:
SELECT b.*
FROM billing b JOIN
(SELECT TOP 350 Sum(sku_amount) AS amt,
sku_code
FROM billing
GROUP BY sku_code
ORDER BY amt DESC
) s
ON s.sku_code = b.sku_code
Upvotes: 1
Reputation: 65363
If your DB version is 2012+, then use
OFFSET { integer_constant | offset_row_count_expression } { ROW | ROWS }
[
FETCH { FIRST | NEXT } {integer_constant | fetch_row_count_expression } { ROW | ROWS } ONLY
]
syntax as
SELECT SUM(sku_amount) AS amt, sku_code, invoice_id
FROM billing
GROUP BY sku_code, invoice_id
ORDER BY ROW_NUMBER() OVER (ORDER BY SUM(sku_amount) DESC)
OFFSET 0 ROWS
FETCH NEXT 350 ROWS ONLY
by using window analytic function ( row_number()
in this case ) in the order by clause without need of a subquery
Upvotes: 1
Reputation: 3801
If I'm understanding right you can just use rank. This will give you all records which have a sku_amount in the top 350
SELECT *
FROM (
SELECT billing.*,
RANK() OVER ( PARTITION BY 1 ORDER BY sku_amount DESC ) rnk
FROM billing
) TMP
WHERE rnk <= 350
Upvotes: 1