realdark
realdark

Reputation: 749

How to optimize this select

SELECT *
FROM   trips,
       dates
WHERE  places_number_on_flag > 0
       AND places_number > 0
       AND ( places_number - ( (SELECT Count(id)
                                FROM   resrv_customers
                                WHERE  trip = trips.id
                                       AND ow > 0)
                               + (SELECT Count(id)
                                  FROM   resrv_customers
                                  WHERE  trip = trips.id
                                         AND extra_seat = 1
                                         AND ow > 0) ) < 20 )
       AND dates.id = trips.trip_date
       AND dates.from_date > 2458553;  

It returns 5 rows after about a minute and 50 seconds.

I want it to be faster.

Upvotes: 0

Views: 80

Answers (3)

Michael Buen
Michael Buen

Reputation: 39393

Try this:

with rc_count as
(
    select trip, count(*) as ow_count, sum( (extra_seat = 1)::int ) as ow_count_with_extra_seat
    from resrv_customers
    where ow > 0
    group by trip    
)

SELECT *
FROM   trips
join   dates on dates.id = trips.trip_date
join   rc_count rc on trips.id = trip 
WHERE  places_number_on_flag > 0
       AND places_number > 0
       AND dates.from_date > 2458553
       and places_number - (ow_count + ow_count_with_extra_seat) < 20       

Upvotes: 0

user330315
user330315

Reputation:

One optimization would be to only run the sub-query once and use conditional aggregation to count the seats:

SELECT *
FROM   trips
  JOIN dates on dates.id = trips.trip_date
WHERE  places_number_on_flag > 0
       AND places_number > 0
       AND ( places_number - ( (SELECT Count(id) + count(id) filter (where extra_seat = 1)
                                FROM   resrv_customers
                                WHERE  trip = trips.id
                                       AND ow > 0)) < 20 )
       AND dates.from_date > 2458553;  

Note that I replaced your ancient and fragile implicit joins in the WHERE clause with an explicit JOIN operator. It does not have any impact on performance, it's just better coding style.

Upvotes: 2

Tim Biegeleisen
Tim Biegeleisen

Reputation: 521239

We can try rewriting your query using a join to the resrv_customers table, instead of using costly correlated subqueries:

WITH cte AS (
    SELECT
        trip,
        COUNT(CASE WHEN ow > 0 THEN 1 END) AS cnt1,
        COUNT(CASE WHEN extra_seat = 1 AND ow > 0 THEN 1 END) AS cnt2
    FROM resrv_customers
    GROUP BY trip
)

SELECT *
FROM trips t
INNER JOIN dates d
    ON t.trip_date = d.id
LEFT JOIN cte r
    ON r.trip = t.id
WHERE
    places_number > 0 AND
    places_number - (r.cnt1 + r.cnt2) < 20 AND
    d.from_date > 2458553;

Indexing here could be tricky, because you are doing a SELECT *. To get Postgres to use any index, you might have to do a lot of column covering, i.e. creating large indices.

Upvotes: 0

Related Questions