Keldeo
Keldeo

Reputation: 183

Group By with Window Function instead of two Window Functions

I have a table with 2 columns: time and id. We think of the rows as sorted first by id, then by time.

╔════════╦══════════╗
║ time   ║ id       ║
╠════════╬══════════╣
║ 9:10   ║  1       ║
║ 9:20   ║  1       ║
║ 10:10  ║  1       ║
║ 11:30  ║  1       ║
║ 11:50  ║  1       ║
║ 10:20  ║  2       ║
║ 10:30  ║  2       ║
║ 11:20  ║  3       ║
║ 11:50  ║  3       ║
╚════════╩══════════╝

I want to select from it only the rows where their id is the same as the 'previous' row id and the time difference from the previous row is less than an hour.

This can be done by first creating a table where there is a 3rd column of the time differences to the previous row and a 4th column of the id differences, and only selecting the rows where the id_diff is 0 and the time_diff is above 1 hour.

But this method seems inelegant because I want to look at each id separately and inside each id to look at its times and check if the consecutive difference is above an hour. That'll reflect better the logic of looking at each id separately, because they are different entities.

So how can it be done with grouping on the id, instead of using window functions twice? I'm aware of GROUP BY's existence.

The code that works, with two window functions:

SELECT auxiliary_table_with_lag_diffs.*
FROM (
    select info.*,
        time-lag(time) over (Order by id, time ) as diff_time,
        id-lag(id) over (order by id, time) as diff_id
    from info
    )auxiliary_table_with_lag_diffs
WHERE diff_time>'01:00:00'
    AND diff_id=0
ORDER BY id, time;

Upvotes: 0

Views: 317

Answers (2)

Thorsten Kettner
Thorsten Kettner

Reputation: 94859

You only need one analytic function call here: Get the previous time within the same ID.

SELECT *
FROM 
(
  select info.*,
    time - lag(time) over (partition by id order by time) as diff_time
  from info
) auxiliary_table_with_lag_diffs
WHERE diff_time > interval '1 hour';
ORDER BY id, time;

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269445

If you want to look at the previous row, then why are you ordering by id first?

SELECT i.*
FROM (select i.*, lag(time) over (order by time) as prev_time,
             lag(id) over (order by time) as prev_id
     from info i
     ) i 
WHERE time < prev_time + interval '1 hour' and id = prev_id
ORDER BY id, time;

If you don't want prev_time and prev_id in the output, just explicitly select the columns that you do want.

NOTE: You might want > rather than < depending on your actual needs (the question is ambiguous).

Upvotes: 1

Related Questions