Reputation: 3751
I have a products
table and a sales
table that keeps record of how many items a given product sold during each date. Of course, not all products have sales everyday.
I need to generate a report that tells me how many consecutive days a product has had sales (from the latest date to the past) and how many items it sold during those days only.
I'd like to tell you how many things I've tried so far, but the only succesful (and slow, recursive) ones are solutions inside my application and not inside SQL, which is what I want.
I also have browsed several similar questions on SO but I haven't found one that lets me have a clear idea of what I really need.
I've setup a SQLFiddle here to show you what I'm talking about. There you will see the only query I can think of, which doesn't give me the result I need. I also added comments there showing what the result of the query should be.
I hope someone here knows how to accomplish that. Thanks in advance for any comments!
Francisco
Upvotes: 4
Views: 2328
Reputation: 17048
A solution in pure SQL without store procedure : Fiddle
SELECT sku
, COUNT(1) AS consecutive_days
, SUM(items) AS items
FROM
(
SELECT sku
, items
-- generate a new guid for each group of consecutive date
-- ie : starting with day_before is null
, @guid := IF(@sku = sku and day_before IS NULL, UUID(), @guid) AS uuid
, @sku := sku AS dummy_sku
FROM
(
SELECT currents.sku
, befores.record_date as day_before
, currents.items
FROM sales currents
LEFT JOIN sales befores
ON currents.sku = befores.sku
AND currents.record_date = befores.record_date + INTERVAL 1 DAY
ORDER BY currents.sku, currents.record_date
) AS main_join
CROSS JOIN (SELECT @sku:=0) foo_sku
CROSS JOIN (SELECT @guid:=UUID()) foo_guid
) AS result_to_group
GROUP BY uuid, sku
The query is really not that hard. Declare variables via cross join (SELECT @type:=0) type
. Then in the selects, you can set variables value row by row. It is necessary for simulating Rank function.
Upvotes: 3
Reputation: 1
select
p.*,
sum(s.items) sales,
count(s.record_date) days_sold
from
products p
join
sales s
on
s.sku = p.sku
where record_date between '2013-04-18 00:00:00' and '2013-04-26 00:00:00'
group by sku;
Upvotes: -1
Reputation: 696
http://sqlfiddle.com/#!2/20108/1
Here is a store procedure that do the job
CREATE PROCEDURE myProc()
BEGIN
-- Drop and create the temp table
DROP TABLE IF EXISTS reached;
CREATE TABLE reached (
sku CHAR(32) PRIMARY KEY,
record_date date,
nb int,
total int)
ENGINE=HEAP;
-- Initial insert, the starting point is the MAX sales record_date of each product
INSERT INTO reached
SELECT products.sku, max(sales.record_date), 0, 0
FROM products
join sales on sales.sku = products.sku
group by products.sku;
-- loop until there is no more updated rows
iterloop: LOOP
-- Update the temptable with the values of the date - 1 row if found
update reached
join sales on sales.sku=reached.sku and sales.record_date=reached.record_date
set reached.record_date = reached.record_date - INTERVAL 1 day,
reached.nb=reached.nb+1,
reached.total=reached.total + sales.items;
-- If no more rows are updated it means we hit the most longest days_sold
IF ROW_COUNT() = 0 THEN
LEAVE iterloop;
END IF;
END LOOP iterloop;
-- select the results of the temp table
SELECT products.sku, products.title, products.price, reached.total as sales, reached.nb as days_sold
from reached
join products on products.sku=reached.sku;
END//
Then you just have to do
call myProc()
Upvotes: 3