Reputation: 9075
I have a table where every ActorId
has a Value
, but the Value
will change over time and we keep track of all changes to these values.
Example:
Id ActorId Value Timestamp
--------------------------------
1 5 10 2019-01-05
2 8 8 2019-01-07
3 8 5 2019-01-08
4 5 15 2019-01-11
5 3 12 2019-01-11
6 3 7 2019-01-12
7 3 14 2019-01-14
[...]
If we load everything into memory it's quite easy to check for every Actor
what the value will be at any moment in time (or it will be null / undefined if it has never been set before the query date). Matter of fact is that I actually need to send all this data to the browser and do this within JavaScript.
The problem is that the full table will become really, really large. Both in unique ActorIds and in value changes per ActorId.
Usually I just need a relative small selection of ActorId
s and a short period of time - say two months.
I could do something like:
SELECT *
FROM [ActorValue]
WHERE
ActorId IN (5, 8, 12, [...])
AND Timestamp >= '2019-02-01'
AND Timestamp <= '2019-04-01'
And send these results to JavaScript for further processing.
The problem is that I also need the value at the start of the period. So for every ActorId I need the values within the date range and one value-change before that.
What is the best way to solve this problem without loading all value changes into memory.
So what I'm in theory looking for is something like:
AND Timestamp >= '2019-02-01' -- plus the entry before this date, grouped by ActorId
Upvotes: 1
Views: 261
Reputation: 1269973
I would just use lead()
:
SELECT av.*
FROM (SELECT av.*,
LEAD(Timestamp) OVER (PARTITION BY ActorId ORDER BY Timestamp) as next_timestamp
FROM ActorValue av
WHERE ActorId IN (5, 8, 12, [...])
) av
WHERE next_Timestamp >= '2019-02-01'
Timestamp <= '2019-04-01';
Upvotes: 1
Reputation: 222492
I think that what this can be solved by adding more conditions in the WHERE
clause. Either the record belongs to the analysis interval, or it is the most recent record for before the start of the analysis period for the concerned actor. The latter condition can be expressed using a NOT EXISTS
condition and a correlated subquery.
Try:
SELECT *
FROM [ActorValue] a
WHERE
ActorId IN (5, 8, 12, [...])
AND
(
(
Timestamp >= '2019-02-01'
AND Timestamp <= '2019-04-01'
)
OR (
Timestamp < '2019-02-01'
AND NOT EXISTS (
SELECT 1 FROM [ActorValue] a1
WHERE
a1.ActorId = a.ActorId
AND a1.Timestamp < '2019-02-01'
AND a1.Timestamp > a.Timestamp
)
)
Upvotes: 1