Mario Toffia
Mario Toffia

Reputation: 530

How to use window lag function to partition the data with multiple sensor types

I'm, trying to use the postgres LAG function to calculate a difference in absolute time between two sampling dates for each sensor in the system.

The best I have come up with is to use WINDOW LAG function to calculate a difference from earlier line. However there are two problems with this. The first value is always null. I'd guess I can solve this using a CASE function. Second is that it does not take each sensor value explicitly.

SELECT 
    seq_id, stream_id, sensor, "timestamp", oper_value 
    "timestamp" - LAG("timestamp",1) OVER (ORDER BY "timestamp") delta
    FROM public.mt_events
    where "type" = 'operational_value_event'
    limit 100;

The data is logged when threshold breached (including hysteresis) and therefore some values are changed more often than other. There are no fixed intervals in when the data is logged.

The end goal to to get a value that have atleast a certain amount of time difference.

Example data:

id stream_id sensor oper_value timestamp
44 100000000    GT1         17 2018-05-16 13:36:21.899821+00
45 100000000    GT2         44 2018-05-16 14:36:21.000000+00
88 100000000    GT1         26 2018-05-18 12:33:22.000000+00
94 100000000    GT1         99 2018-05-18 12:33:23.002000+00

For example if a selection of data with at least time difference of 5 minutes I would like to get the following values:

id stream_id sensor oper_value timestamp
44 100000000    GT1         17 2018-05-16 13:36:21.899821+00
45 100000000    GT2         44 2018-05-16 14:36:21.000000+00
88 100000000    GT1         26 2018-05-18 12:33:22.000000+00

The last GT1 is filtered out since the difference was less than 5 minutes.

Is there any way of doing this effectively using a SQL statement or do I need to write a stored procedure?

Cheers, Mario

Upvotes: 2

Views: 273

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1270463

You can use the three-argument form of lag() with partition by:

("timestamp" -
 LAG("timestamp", 1, "timestamp") OVER (PARTITION BY sensor ORDER BY "timestamp")
) as delta

For your ultimate problem, the NULL value for the first row doesn't matter. You can solve the problem using a subquery:

select *
from (select seq_id, stream_id, sensor, "timestamp", oper_value ,
             lag("timestamp") over (partition by sensor order by timestamp) as prev_timestamp
      from public.mt_events
      where "type" = 'operational_value_event'
     ) t
where delta is null or
      prev_timestamp < timestamp - interval '5 minute';

Upvotes: 1

Related Questions