Reputation: 159
I have two tables with seasonal discounts. In each of these two tables are non overlapping date ranges, product id and discount that applies in that date range. Date ranges from one table however may overlap with date ranges in the other table. Given a third table with product id and its default price, the goal is to efficiently calculate seasonal - date ranged prices for product id after discounts from both tables have been applied.
Discounts multiply only in their overlapping period, e.g. if a first discount is 0.9 (10%) from 2019-07-01 to 2019-07-30, and a second discount is 0.8 from 2019-07-16 to 2019-08-15, this translates to: 0.9 discount from 2019-07-01 to 2019-07-15, 0.72 discount from 2019-07-16 to 2019-07-30, and 0.8 discount from 2019-07-31 to 2019-08-15.
I have managed to come to a solution, by first generating a table that holds ordered all of start and end dates in both discount tables, then generating a resulting table of all smallest disjoint intervals, and then for each interval, generating all prices, default, price with only the discount from first table applied (if any applies), price with only the discount from second table applied (if any applies), price with both discounts applied (if so possible) and then taking a min of these four prices. See sample code bellow.
declare @pricesDefault table (product_id int, price decimal)
insert into @pricesDefault
values
(1, 100),
(2, 120),
(3, 200),
(4, 50)
declare @discountTypeA table (product_id int, modifier decimal(4,2), startdate datetime, enddate datetime)
insert into @discountTypeA
values
(1, 0.75, '2019-06-06', '2019-07-06'),
(1, 0.95, '2019-08-06', '2019-08-20'),
(1, 0.92, '2019-05-06', '2019-06-05'),
(2, 0.75, '2019-06-08', '2019-07-19'),
(2, 0.95, '2019-07-20', '2019-09-20'),
(3, 0.92, '2019-05-06', '2019-06-05')
declare @discountTypeB table (product_id int, modifier decimal(4,2), startdate datetime, enddate datetime)
insert into @discountTypeB
values
(1, 0.85, '2019-06-20', '2019-07-03'),
(1, 0.65, '2019-08-10', '2019-08-29'),
(1, 0.65, '2019-09-10', '2019-09-27'),
(3, 0.75, '2019-05-08', '2019-05-19'),
(2, 0.95, '2019-05-20', '2019-05-21'),
(3, 0.92, '2019-09-06', '2019-09-09')
declare @pricingPeriod table(product_id int, discountedPrice decimal, startdate datetime, enddate datetime);
with allDates(product_id, dt) as
(select distinct product_id, dta.startdate from @discountTypeA dta
union all
select distinct product_id, dta.enddate from @discountTypeA dta
union all
select distinct product_id, dtb.startdate from @discountTypeB dtb
union all
select distinct product_id, dtb.enddate from @discountTypeB dtb
),
allproductDatesWithId as
(select product_id, dt, row_number() over (partition by product_id order by dt asc) 'Id'
from allDates),
sched as
(select pd.product_id, apw1.dt startdate, apw2.dt enddate
from @pricesDefault pd
join allproductDatesWithId apw1 on apw1.product_id = pd.product_id
join allproductDatesWithId apw2 on apw2.product_id = pd.product_id and apw2.Id= apw1.Id+1
),
discountAppliedTypeA as(
select sc.product_id, sc.startdate, sc.enddate,
min(case when sc.startdate >= dta.startdate and dta.enddate >= sc.enddate then pd.price * dta.modifier else pd.price end ) 'price'
from sched sc
join @pricesDefault pd on pd.product_id = sc.product_id
left join @discountTypeA dta on sc.product_id = dta.product_id
group by sc.product_id, sc.startdate , sc.enddate ),
discountAppliedTypeB as(
select daat.product_id, daat.startdate, daat.enddate,
min(case when daat.startdate >= dta.startdate and dta.enddate >= daat.enddate then daat.price * dta.modifier else daat.price end ) 'price'
from discountAppliedTypeA daat
left join @discountTypeB dta on daat.product_id = dta.product_id
group by daat.product_id, daat.startdate , daat.enddate )
select * from discountAppliedTypeB
order by product_id, startdate
Calculating a min of all possible prices is unnecessary overhead. I'd like to generate, just one resulting price and have it as a final price.
Here is the resulting set:
product_id start_date end_date final_price
1 2019-05-06 00:00:00.000 2019-06-05 00:00:00.000 92.0000
1 2019-06-05 00:00:00.000 2019-06-06 00:00:00.000 100.0000
1 2019-06-06 00:00:00.000 2019-06-20 00:00:00.000 75.0000
1 2019-06-20 00:00:00.000 2019-07-03 00:00:00.000 63.7500
1 2019-07-03 00:00:00.000 2019-07-06 00:00:00.000 75.0000
1 2019-07-06 00:00:00.000 2019-08-06 00:00:00.000 100.0000
1 2019-08-06 00:00:00.000 2019-08-10 00:00:00.000 95.0000
1 2019-08-10 00:00:00.000 2019-08-20 00:00:00.000 61.7500
1 2019-08-20 00:00:00.000 2019-08-29 00:00:00.000 65.0000
1 2019-08-29 00:00:00.000 2019-09-10 00:00:00.000 100.0000
1 2019-09-10 00:00:00.000 2019-09-27 00:00:00.000 65.0000
2 2019-05-20 00:00:00.000 2019-05-21 00:00:00.000 114.0000
2 2019-05-21 00:00:00.000 2019-06-08 00:00:00.000 120.0000
2 2019-06-08 00:00:00.000 2019-07-19 00:00:00.000 90.0000
2 2019-07-19 00:00:00.000 2019-07-20 00:00:00.000 120.0000
2 2019-07-20 00:00:00.000 2019-09-20 00:00:00.000 114.0000
3 2019-05-06 00:00:00.000 2019-05-08 00:00:00.000 184.0000
3 2019-05-08 00:00:00.000 2019-05-19 00:00:00.000 138.0000
3 2019-05-19 00:00:00.000 2019-06-05 00:00:00.000 184.0000
3 2019-06-05 00:00:00.000 2019-09-06 00:00:00.000 200.0000
3 2019-09-06 00:00:00.000 2019-09-09 00:00:00.000 184.0000
Is there a more efficient to this solution that I am not seeing?
I have a large data set of ~20K rows in real product prices table, and 100K- 200K rows in both discount tables.
Indexing structure of the actual tables is following: product id is clustered index in product prices table, whilst discount tables have an Id surrogate column as clustered index (as well as primary key), and (product_id, start_date, end_date) as a non clustered index.
Upvotes: 0
Views: 945
Reputation: 1270493
You can generate the dates using union
. Then bring in all discounts that are valid on that date, and calculate the total.
This looks like:
with prices as (
select a.product_id, v.dte
from @discountTypeA a cross apply
(values (a.startdate), (a.enddate)) v(dte)
union -- on purpose to remove duplicates
select b.product_id, v.dte
from @discountTypeB b cross apply
(values (b.startdate), (b.enddate)) v(dte)
),
p as (
select p.*, 1-a.modifier as a_discount, 1-b.modifier as b_discount, pd.price
from prices p left join
@pricesDefault pd
on pd.product_id = p.product_id left join
@discountTypeA a
on p.product_id = a.product_id and
p.dte >= a.startdate and p.dte < a.enddate left join
@discountTypeb b
on p.product_id = b.product_id and
p.dte >= b.startdate and p.dte < b.enddate
)
select p.product_id, price * (1 - coalesce(a_discount, 0)) * (1 - coalesce(b_discount, 0)) as price, a_discount, b_discount,
dte as startdate, lead(dte) over (partition by product_id order by dte) as enddate
from p
order by product_id, dte;
Here is a db<>fiddle.
Upvotes: 1
Reputation: 2814
Here is a version that works out the price for every date. You can then either use this directly, or use one of the many solutions on SO for working out date ranges.
In this example I have hard coded the date limits, but you could easily read them from your tables if you prefer.
I haven't done any performance testing on this, but give it a go. Its quite a bit simpler do if you have the right indexes it might be quicker.
;with dates as (
select convert(datetime,'2019-05-06') as d
union all
select d+1 from dates where d<'2019-09-27'
)
select pricesDefault.product_id, d, pricesDefault.price as baseprice,
discountA.modifier as dA,
discountB.modifier as dB,
pricesDefault.price*isnull(discountA.modifier,1)*isnull(discountB.modifier,1) as finalprice
from @pricesDefault pricesDefault
cross join dates
left join @discountTypeA discountA on discountA.product_id=pricesDefault.product_id and d between discountA.startdate and discountA.enddate
left join @discountTypeB discountB on discountB.product_id=pricesDefault.product_id and d between discountB.startdate and discountB.enddate
order by pricesDefault.product_id, d
Option (MaxRecursion 1000)
Upvotes: 0