Reputation: 44735
I'm fairly new to Postgres, and I'm trying to implements some complex functionality.
The following query works:
select (
select ts as start
from (
select *
from events
where (reason = 'START' or reason = 'STOP')
and ts < ?
and user = ?
order by ts desc
limit 1
) as subquery_start
where reason = 'START'
),(
select ts as stop
from (
select *
from events
where (reason = 'START' or reason = 'STOP')
and ts > ?
and user = ?
order by ts
limit 1
) as subquery_stop
where reason = 'STOP'
);
It finds whether the user, at a time, is between START and STOP events, returning:
start | stop
---------------------------+---------------------------
2018-06-01 10:44:55.52+01 | 2018-06-01 10:45:07.52+01
(1 row)
whether they are not:
start | stop
-------+------
|
(1 row)
or whether they're just after a START and a later STOP is not yet present:
start | stop
---------------------------+------
2018-06-01 10:44:55.52+01 |
(1 row)
Is it possible to simplify such a query, given that I want one row returned, as in the examples above?
Will three levels of nested selects cause performance issues?
Upvotes: 0
Views: 104
Reputation: 28253
use window functions to avoid nesting altogether
SELECT
CASE WHEN FIRST_VALUE(reason) OVER w1 = 'START'
THEN FIRST_VALUE(ts) OVER w1
ELSE NULL
END start,
CASE WHEN LAST_VALUE(reason) OVER w2 = 'STOP'
THEN LAST_VALUE(ts) OVER w2
ELSE NULL
END stop
FROM events
WHERE (reason = 'START' or reason = 'STOP')
AND ts > ? AND ts < ?
AND user = ?
WINDOW w1 AS (ORDER BY reason != 'START', ts)
w2 AS (ORDER BY reason = 'STOP', ts)
LIMIT 1
This works because Boolean values are sorted False
, True
.
So, w1
puts all the rows where reason = 'START'
at the top, and within that group sorts by ts. We pick out the ts
in the first row with FIRST_VALUE
. Similarly, we get the last ts
where reason = 'STOP'
This detects the following cases:
START
or STOP
reasons were found for user in the time range (no rows returned)START
or STOP
(but not both) reasons were found for user in the time range (row is returned with 1 null)START
ts
is after the last STOP
ts
(e.g. if the very last row ordered by ts is a only one where reason = 'START'
(comparison of the returned row's start
& stop
valuesSTART
ts
< STOP
ts
as expected. (row is returned with both columns properly populated)Upvotes: 0
Reputation: 12402
It looks pretty good to me. the inner selects being limit 1 will use any index you have on the 'user,ts' column-pair,
The best index for this query would be on
events(user,ts) where (reason = 'START' or reason = 'STOP')
but on events(user,ts)
should be almost as good.
possibly a clearer way to get a similar result would be
WITH subquery_before AS (
select ts,reason
from events
where (reason = 'START' or reason = 'STOP')
and ts < ?
and user = ?
order by ts desc
limit 1
),
subquery_after AS (
select ts,reason
from events
where (reason = 'START' or reason = 'STOP')
and ts > ?
and user = ?
order by ts
limit 1
)
SELECT
subquery_before.ts AS start,
subquery_after.ts AS stop
WHERE subquery_before.reason = 'START'
AND subquery_after.reason = 'STOP'
Upvotes: 1