Dirk Boer
Dirk Boer

Reputation: 9075

SQL get all values within a time range - AND the one before that

I have a table where every ActorId has a Value, but the Value will change over time and we keep track of all changes to these values.

Example:

Id  ActorId     Value   Timestamp
--------------------------------
1   5           10      2019-01-05
2   8           8       2019-01-07
3   8           5       2019-01-08
4   5           15      2019-01-11
5   3           12      2019-01-11
6   3           7       2019-01-12
7   3           14      2019-01-14
[...]

If we load everything into memory it's quite easy to check for every Actor what the value will be at any moment in time (or it will be null / undefined if it has never been set before the query date). Matter of fact is that I actually need to send all this data to the browser and do this within JavaScript.

The problem is that the full table will become really, really large. Both in unique ActorIds and in value changes per ActorId.

Usually I just need a relative small selection of ActorIds and a short period of time - say two months.

I could do something like:

SELECT *
FROM [ActorValue]
WHERE 
    ActorId IN (5, 8, 12, [...])
    AND Timestamp >= '2019-02-01' 
    AND Timestamp <= '2019-04-01'

And send these results to JavaScript for further processing.

The problem is that I also need the value at the start of the period. So for every ActorId I need the values within the date range and one value-change before that.

What is the best way to solve this problem without loading all value changes into memory.

So what I'm in theory looking for is something like:

 AND Timestamp >= '2019-02-01' -- plus the entry before this date, grouped by ActorId

Upvotes: 1

Views: 261

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1269973

I would just use lead():

SELECT av.*
FROM (SELECT av.*, 
             LEAD(Timestamp) OVER (PARTITION BY ActorId ORDER BY Timestamp) as next_timestamp
      FROM ActorValue av
      WHERE ActorId IN (5, 8, 12, [...])
     ) av
WHERE next_Timestamp >= '2019-02-01' 
      Timestamp <= '2019-04-01';

Upvotes: 1

GMB
GMB

Reputation: 222492

I think that what this can be solved by adding more conditions in the WHERE clause. Either the record belongs to the analysis interval, or it is the most recent record for before the start of the analysis period for the concerned actor. The latter condition can be expressed using a NOT EXISTS condition and a correlated subquery.

Try:

SELECT *
FROM [ActorValue] a
WHERE 
    ActorId IN (5, 8, 12, [...])
    AND 
    (
        (
            Timestamp >= '2019-02-01' 
            AND Timestamp <= '2019-04-01'
        )
        OR (
            Timestamp < '2019-02-01' 
            AND NOT EXISTS (
                SELECT 1 FROM [ActorValue] a1 
                WHERE 
                    a1.ActorId = a.ActorId 
                    AND a1.Timestamp < '2019-02-01' 
                    AND a1.Timestamp > a.Timestamp
            )
        )

Upvotes: 1

Related Questions