wali
wali

Reputation: 239

Product sales by month - SQL

I just created a small data warehouse with the following details.

Fact Table

  1. Sales

Dimensions

  1. Supplier
  2. Products
  3. Time (Range is one year)
  4. Stores

I want to query which product has the max sales by month, I mean the output to be like

Month - Product Code - Num_Of_Items
JAN      xxxx            xxxxx
FEB      xxxx            xxxxx

I tried the following query

with product_sales as(
SELECT dd.month,
  fs.p_id,
  dp.title,
 SUM(number_of_items) Num
FROM fact_sales fs
INNER JOIN dim_products dp
ON fs.p_id = dp.p_id
INNER JOIN dim_date dd
ON dd.date_id = fs.date_id
GROUP BY dd.month,
  fs.p_id,
  dp.title
)
select distinct month,movie_id,max(num) 
from product_sales 
group by movie_id,title, month;

Instead of max of 12 rows, I am having 132 records. I need guidance with this. Thanks.

Upvotes: 1

Views: 2429

Answers (2)

schurik
schurik

Reputation: 7928

you can user MAX () KEEP (DENSE_RANK FIRST ORDER BY ) to select the movie_id with the max value of num

...
select 
   month, 
   MAX(movie_id) KEEP (DENSE_RANK FIRST  order by num desc) as movie_id, 
   MAX(num)
from 
 abc
group by month
;

Upvotes: 1

Ed Gibbs
Ed Gibbs

Reputation: 26343

There are a few things about your query that don't make sense, such as:

  • Where does movie_id come from?
  • What is from abc? Should it be from product_sales?

That said, if you need the maximum product sales by month and you need to include the product code (or movie ID or whatever), you need an analytical query. Yours would go something like this:

WITH product_sales AS (
  SELECT
    dd.month,
    fs.p_id,
    dp.title,
   SUM(number_of_items) Num,
   RANK() OVER (PARTITION BY dd.month ORDER BY SUM(number_of_items) DESC) NumRank
  FROM fact_sales fs
  INNER JOIN dim_products dp ON fs.p_id = dp.p_id
  INNER JOIN dim_date dd ON dd.date_id = fs.date_id
  GROUP BY dd.month, fs.p_id, dp.title
)
SELECT month, p_id, title, num
FROM product_sales
WHERE NumRank = 1

Note that if there's a tie for top sales in any month, this query will show all top sales for the month. In other words, if product codes AAAA and BBBB are tied for top sales in January, the query results will have a January row for both products.

If you want just one row per month even if there's a tie, use ROW_NUMBER instead of RANK(), but note that ROW_NUMBER will arbitrarily pick a winner unless you define a tie-breaker. For example, to have the lowest p_id be the tie-breaker, define the NumRank column like this:

ROW_NUMBER() OVER (
  PARTITION BY dd.month
  ORDER BY SUM(number_of_items) DESC, p_id
) NumRank

Upvotes: 2

Related Questions