Anonymouse
Anonymouse

Reputation: 33

Get non-overlapping dates ranges for prices history data

Let's assume that I have the following table:

CREATE TABLE [dbo].[PricesHist]
(
   [Product] [varchar](6) NOT NULL,
   [Price] [float] NOT NULL,
   [StartDate] [datetime] NOT NULL,
   [EndDate] [datetime] NOT NULL
)


INSERT [dbo].[PricesHist] ([Product], [Price], [StartDate], [EndDate]) VALUES (N'Apples', 4.9, CAST(0x00009D2C00000000 AS DateTime), CAST(0x00009D2C00000000 AS DateTime))
INSERT [dbo].[PricesHist] ([Product], [Price], [StartDate], [EndDate]) VALUES (N'Apples', 4.9, CAST(0x00009D2D00000000 AS DateTime), CAST(0x00009D2D00000000 AS DateTime))
INSERT [dbo].[PricesHist] ([Product], [Price], [StartDate], [EndDate]) VALUES (N'Apples', 2.5, CAST(0x00009D2E00000000 AS DateTime), CAST(0x00009D2E00000000 AS DateTime))
INSERT [dbo].[PricesHist] ([Product], [Price], [StartDate], [EndDate]) VALUES (N'Apples', 4.9, CAST(0x00009D3000000000 AS DateTime), CAST(0x00009D3000000000 AS DateTime))
INSERT [dbo].[PricesHist] ([Product], [Price], [StartDate], [EndDate]) VALUES (N'Apples', 4.9, CAST(0x00009D3100000000 AS DateTime), CAST(0x00009D3100000000 AS DateTime))
INSERT [dbo].[PricesHist] ([Product], [Price], [StartDate], [EndDate]) VALUES (N'Apples', 4.9, CAST(0x00009D3400000000 AS DateTime), CAST(0x00009D3400000000 AS DateTime))
INSERT [dbo].[PricesHist] ([Product], [Price], [StartDate], [EndDate]) VALUES (N'Apples', 2.5, CAST(0x00009D3500000000 AS DateTime), CAST(0x00009D3500000000 AS DateTime))
INSERT [dbo].[PricesHist] ([Product], [Price], [StartDate], [EndDate]) VALUES (N'Apples', 4.9, CAST(0x00009D3600000000 AS DateTime), CAST(0x00009D3600000000 AS DateTime))
INSERT [dbo].[PricesHist] ([Product], [Price], [StartDate], [EndDate]) VALUES (N'Apples', 4.9, CAST(0x00009D3700000000 AS DateTime), CAST(0x00009D3700000000 AS DateTime))
INSERT [dbo].[PricesHist] ([Product], [Price], [StartDate], [EndDate]) VALUES (N'Apples', 4.9, CAST(0x00009D3800000000 AS DateTime), CAST(0x00009D3800000000 AS DateTime))
INSERT [dbo].[PricesHist] ([Product], [Price], [StartDate], [EndDate]) VALUES (N'Apples', 4.9, CAST(0x00009D3A00000000 AS DateTime), CAST(0x00009D3A00000000 AS DateTime))
INSERT [dbo].[PricesHist] ([Product], [Price], [StartDate], [EndDate]) VALUES (N'Apples', 4.9, CAST(0x00009D3B00000000 AS DateTime), CAST(0x00009D3B00000000 AS DateTime))
INSERT [dbo].[PricesHist] ([Product], [Price], [StartDate], [EndDate]) VALUES (N'Apples', 2.5, CAST(0x00009D3C00000000 AS DateTime), CAST(0x00009D3C00000000 AS DateTime))
INSERT [dbo].[PricesHist] ([Product], [Price], [StartDate], [EndDate]) VALUES (N'Apples', 4.9, CAST(0x00009D3D00000000 AS DateTime), CAST(0x00009D3D00000000 AS DateTime))
INSERT [dbo].[PricesHist] ([Product], [Price], [StartDate], [EndDate]) VALUES (N'Apples', 4.9, CAST(0x00009D3E00000000 AS DateTime), CAST(0x00009D3E00000000 AS DateTime))
INSERT [dbo].[PricesHist] ([Product], [Price], [StartDate], [EndDate]) VALUES (N'Apples', 4.9, CAST(0x00009D3F00000000 AS DateTime), CAST(0x00009D3F00000000 AS DateTime))
INSERT [dbo].[PricesHist] ([Product], [Price], [StartDate], [EndDate]) VALUES (N'Apples', 4.9, CAST(0x00009D4100000000 AS DateTime), CAST(0x00009D4100000000 AS DateTime))
INSERT [dbo].[PricesHist] ([Product], [Price], [StartDate], [EndDate]) VALUES (N'Apples', 4.9, CAST(0x00009D4200000000 AS DateTime), CAST(0x00009D4200000000 AS DateTime))
INSERT [dbo].[PricesHist] ([Product], [Price], [StartDate], [EndDate]) VALUES (N'Apples', 2.5, CAST(0x00009D4300000000 AS DateTime), CAST(0x00009D4300000000 AS DateTime))
INSERT [dbo].[PricesHist] ([Product], [Price], [StartDate], [EndDate]) VALUES (N'Apples', 4.9, CAST(0x00009D4400000000 AS DateTime), CAST(0x00009D4400000000 AS DateTime))
INSERT [dbo].[PricesHist] ([Product], [Price], [StartDate], [EndDate]) VALUES (N'Apples', 4.9, CAST(0x00009D4500000000 AS DateTime), CAST(0x00009D4500000000 AS DateTime))
INSERT [dbo].[PricesHist] ([Product], [Price], [StartDate], [EndDate]) VALUES (N'Apples', 4.9, CAST(0x00009D4600000000 AS DateTime), CAST(0x00009D4600000000 AS DateTime))
INSERT [dbo].[PricesHist] ([Product], [Price], [StartDate], [EndDate]) VALUES (N'Apples', 4.9, CAST(0x00009D4800000000 AS DateTime), CAST(0x00009D4800000000 AS DateTime))
INSERT [dbo].[PricesHist] ([Product], [Price], [StartDate], [EndDate]) VALUES (N'Apples', 2.5, CAST(0x00009D4A00000000 AS DateTime), CAST(0x00009D4A00000000 AS DateTime))

As you can see, there are two prices on that month for Apples. 4.90 and 2.50. In order to tidy this table up, I need to get this information as a date range rather than a row per day as it currently is.

I can obviously do this with Min and Max aggregates easily but the ranges overlap and other business code expect non-overlapping ranges. I also tried to achieve this with self joins and row_number(), but without much success...

Here is what I'm trying to achieve as the output:

Product | StartDate   |  EndDate    | Price
-------------------------------------------
Apples  | 01 Mar 2010 | 02 Mar 2010 | 4.90
Apples  | 03 Mar 2010 | 03 Mar 2010 | 2.50
Apples  | 05 Mar 2010 | 09 Mar 2010 | 4.90
Apples  | 10 Mar 2010 | 10 Mar 2010 | 2.50
Apples  | 11 Mar 2010 | 16 Mar 2010 | 4.90
Apples  | 17 Mar 2010 | 17 Mar 2010 | 2.50
Apples  | 18 Mar 2010 | 23 Mar 2010 | 4.90
Apples  | 24 Mar 2010 | 24 Mar 2010 | 2.50
Apples  | 25 Mar 2010 | 30 Mar 2010 | 4.90
Apples  | 31 Mar 2010 | 31 Mar 2010 | 2.50

What would please be the best approach to get this done?

Thanks a lot in advance,

Upvotes: 3

Views: 1156

Answers (1)

Tom H
Tom H

Reputation: 47392

This should get you pretty close. Once you determine how you want to handle missing dates, you can adjust it appropriately:

SELECT
    SD.Product,
    SD.Price,
    SD.StartDate,
    MAX(ED.EndDate) AS EndDate
FROM
    dbo.PricesHist SD
LEFT OUTER JOIN dbo.PricesHist ED ON
    ED.Product = SD.Product AND
    ED.Price = SD.Price
LEFT OUTER JOIN dbo.PricesHist LD ON
    LD.Product = SD.Product AND
    LD.Price <> SD.Price AND
    LD.EndDate < SD.StartDate
LEFT OUTER JOIN dbo.PricesHist LMD ON
    LMD.Product = SD.Product AND
    LMD.Price = SD.Price AND
    LMD.StartDate > ISNULL(LD.EndDate, '1900-01-01') AND
    LMD.StartDate < SD.StartDate
WHERE
    NOT EXISTS (SELECT * FROM dbo.PricesHist MD WHERE MD.Product = SD.Product AND MD.Price <> SD.Price AND MD.StartDate BETWEEN SD.StartDate AND ED.EndDate) AND
    LMD.Product IS NULL
GROUP BY
    SD.Product,
    SD.Price,
    SD.StartDate
ORDER BY
    SD.StartDate

Upvotes: 2

Related Questions