Nuri Tasdemir
Nuri Tasdemir

Reputation: 9842

GroupBy with respect to record intervals on another table

I prepared a sql fiddle for my question. Here it is There is a working code here. I am asking whether there exists an alternative solution which I did not think.

CREATE TABLE [Product]
    ([Timestamp] bigint NOT NULL PRIMARY KEY, 
     [Value] float NOT NULL
    )
;

CREATE TABLE [PriceTable]
    ([Timestamp] bigint NOT NULL PRIMARY KEY, 
     [Price] float NOT NULL
    )
;

INSERT INTO [Product]
    ([Timestamp], [Value])
VALUES
    (1, 5),
    (2, 3),
    (4, 9),
    (5, 2),
    (7, 11),
    (9, 3)    
;

INSERT INTO [PriceTable]
    ([Timestamp], [Price])
VALUES
    (1, 1),
    (3, 4),
    (7, 2.5),
    (10, 3)    
;

Query:

SELECT [Totals].*, [PriceTable].[Price] 
FROM 
(
    SELECT [PriceTable].[Timestamp]
           ,SUM([Value]) AS [TotalValue]
    FROM [Product], 
         [PriceTable]
    WHERE [PriceTable].[Timestamp] <= [Product].[Timestamp]
     AND NOT EXISTS (SELECT * FROM [dbo].[PriceTable] pt 
                     WHERE pt.[Timestamp] <= [Product].[Timestamp]
                       AND pt.[Timestamp] > [PriceTable].[Timestamp])
    GROUP BY [PriceTable].[Timestamp]
) AS [Totals]
INNER JOIN [dbo].[PriceTable]
        ON [PriceTable].[Timestamp] = [Totals].[Timestamp]
ORDER BY [PriceTable].[Timestamp]

Result

| Timestamp | TotalValue | Price |
|-----------|------------|-------|
| 1         | 8          | 1     |
| 3         | 11         | 4     |
| 7         | 14         | 2.5   |

Here, my first table [Product] contains the product values for different timestamps. And second table [PriceTable] contains the prices for different time intervals. A given price is valid until a new price is set. Therefore the price with timestamp 1 is valid for Products with timestamps 1 and 2.

I am trying to get the total number of products with respect to given prices. The SQL on the fiddle produces what I expect.

Is there a smarter way to get the same result?

By the way, I am using SQLServer 2014.

Upvotes: 3

Views: 87

Answers (3)

P.Salmon
P.Salmon

Reputation: 17640

You could use a cte

;with cte as
(
select  p1.[timestamp] as lowval,
        case
        when    p2.[timestamp] is not null then p2.[timestamp] - 1
        else    999999
        end     hival,
        p1.price
from
(
select  p1.[timestamp],p1.price,
        row_number() over (order by p1.[timestamp]) rn
from    pricetable p1 ) p1
left outer join 
(select p1.[timestamp],p1.price,
        row_number() over (order by p1.[timestamp]) rn
from    pricetable p1) p2 
on      p2.rn = p1.rn + 1
)
select  cte.lowval as 'timestamp',sum(p1.value) TotalValue,cte.price
from    product p1
join    cte on p1.[Timestamp] between cte.lowval and cte.hival
group   by cte.lowval,cte.price
order   by cte.lowval

It's a lot easier to understand and the execution plan compares favourably with your query (about 10%) cheaper

Upvotes: 0

Giorgi Nakeuri
Giorgi Nakeuri

Reputation: 35780

DECLARE @Product TABLE
    (
      [Timestamp] BIGINT NOT NULL
                         PRIMARY KEY ,
      [Value] FLOAT NOT NULL
    );

DECLARE @PriceTable TABLE
    (
      [Timestamp] BIGINT NOT NULL
                         PRIMARY KEY ,
      [Price] FLOAT NOT NULL
    );

INSERT  INTO @Product
        ( [Timestamp], [Value] )
VALUES  ( 1, 5 ),
        ( 2, 3 ),
        ( 4, 9 ),
        ( 5, 2 ),
        ( 7, 11 ),
        ( 9, 3 );

INSERT  INTO @PriceTable
        ( [Timestamp], [Price] )
VALUES  ( 1, 1 ),
        ( 3, 4 ),
        ( 7, 2.5 ),
        ( 10, 3 );

WITH    cte
          AS ( SELECT   * ,
                        LEAD(pt.[Timestamp]) OVER ( ORDER BY pt.[Timestamp] ) AS [lTimestamp]
               FROM     @PriceTable pt
             )
    SELECT  cte.[Timestamp] ,
            ( SELECT    SUM(Value)
              FROM      @Product
              WHERE     [Timestamp] >= cte.[Timestamp]
                        AND [Timestamp] < cte.[lTimestamp]
            ) AS [TotalValue],
            cte.[Price]
    FROM    cte

Idea is to generate intervals from price table like:

1 - 3
3 - 7
7 - 10

and sum up all values in those intervals.

Output:

Timestamp   TotalValue  Price
1           8           1
3           11          4
7           14          2.5
10          NULL        3

You can simply add WHERE clause if you want to filter out rows where no orders are sold.

Also you can indicate the default value for LEAD window function if you want to close the last interval like:

LEAD(pt.[Timestamp], 1, 100) 

and I guess it would be something like this in production:

LEAD(pt.[Timestamp], 1, GETDATE()) 

Upvotes: 1

Ash
Ash

Reputation: 6035

I think I've got a query which is easier to read. Does this work for you?

select pt.*, 
(select sum(P.Value) from Product P where 
 P.TimeStamp between pt.TimeStamp and (
 --get the next time stamp
 select min(TimeStamp)-1 from PriceTable where TimeStamp > pt.TimeStamp
 )) as TotalValue from PriceTable pt
 --exclude entries with timestamps greater than those in Product table
 where pt.TimeStamp < (select max(TimeStamp) from Product)

Very detailed question BTW

Upvotes: 0

Related Questions