Reputation: 239
I just created a small data warehouse with the following details.
Fact Table
Dimensions
I want to query which product has the max sales by month, I mean the output to be like
Month - Product Code - Num_Of_Items
JAN xxxx xxxxx
FEB xxxx xxxxx
I tried the following query
with product_sales as(
SELECT dd.month,
fs.p_id,
dp.title,
SUM(number_of_items) Num
FROM fact_sales fs
INNER JOIN dim_products dp
ON fs.p_id = dp.p_id
INNER JOIN dim_date dd
ON dd.date_id = fs.date_id
GROUP BY dd.month,
fs.p_id,
dp.title
)
select distinct month,movie_id,max(num)
from product_sales
group by movie_id,title, month;
Instead of max of 12 rows, I am having 132 records. I need guidance with this. Thanks.
Upvotes: 1
Views: 2429
Reputation: 7928
you can user MAX () KEEP (DENSE_RANK FIRST ORDER BY ) to select the movie_id with the max value of num
...
select
month,
MAX(movie_id) KEEP (DENSE_RANK FIRST order by num desc) as movie_id,
MAX(num)
from
abc
group by month
;
Upvotes: 1
Reputation: 26343
There are a few things about your query that don't make sense, such as:
movie_id
come from?from abc
? Should it be from product_sales
?That said, if you need the maximum product sales by month and you need to include the product code (or movie ID or whatever), you need an analytical query. Yours would go something like this:
WITH product_sales AS (
SELECT
dd.month,
fs.p_id,
dp.title,
SUM(number_of_items) Num,
RANK() OVER (PARTITION BY dd.month ORDER BY SUM(number_of_items) DESC) NumRank
FROM fact_sales fs
INNER JOIN dim_products dp ON fs.p_id = dp.p_id
INNER JOIN dim_date dd ON dd.date_id = fs.date_id
GROUP BY dd.month, fs.p_id, dp.title
)
SELECT month, p_id, title, num
FROM product_sales
WHERE NumRank = 1
Note that if there's a tie for top sales in any month, this query will show all top sales for the month. In other words, if product codes AAAA
and BBBB
are tied for top sales in January, the query results will have a January row for both products.
If you want just one row per month even if there's a tie, use ROW_NUMBER
instead of RANK()
, but note that ROW_NUMBER
will arbitrarily pick a winner unless you define a tie-breaker. For example, to have the lowest p_id
be the tie-breaker, define the NumRank
column like this:
ROW_NUMBER() OVER (
PARTITION BY dd.month
ORDER BY SUM(number_of_items) DESC, p_id
) NumRank
Upvotes: 2