Surya
Surya

Reputation: 2699

Postgres: If we select a computed column multiple times, will Postgres compute it again and again?

Here is the query that I am trying,

SELECT s.id, s.name AS name,
CASE WHEN (ARRAY_AGG(tgs.status) @> '{Hard} ') THEN 'Hard'
WHEN (ARRAY_AGG(tgs.status) @> '{Soft} ') THEN 'Soft'
WHEN (ARRAY_AGG(tgs.status) @> '{Fluid} ') THEN 'Fluid'
WHEN (ARRAY_AGG(tgs.status) @> '{Gummy} ') THEN 'Gummy'
WHEN (ARRAY_AGG(tgs.status) @> '{Expired} ') THEN 'Expired'
END AS status, 
COUNT(*) OVER()
FROM sweets AS s 
INNER JOIN tasty_goofy_sweets AS tgs on tgs.sweet_id = s.id
GROUP BY s.id;

While implementing this my friend suggested that, instead of computing array_agg every time in the switch case, we could use LEFT JOIN LATERAL and compute it just once. i.e) to implement like below

SELECT s.id, s.name AS name,
CASE WHEN (tgs.status @> '{Hard} ') THEN 'Hard'
WHEN (tgs.arr_status @> '{Soft} ') THEN 'Soft'
WHEN (tgs.arr_status @> '{Fluid} ') THEN 'Fluid'
WHEN (tgs.arr_status @> '{Gummy} ') THEN 'Gummy'
WHEN (tgs.arr_status @> '{Expired} ') THEN 'Expired'
END AS status, 
COUNT(*) OVER()
FROM sweets AS s 
LEFT JOIN LATERAL ( SELECT ARRAY_AGG(tgs.status) AS arr_status FROM tasty_goofy_sweets tgs WHERE  tgs.sweet_id = s.id
) AS tgs ON TRUE
GROUP BY s.id;

But I am not sure if Postgres computes the ARRAY_AGG value every time, how can we decide which approach is better? I tried looking at explain analyse for both the queries, the number of rows involved in the latter query is more than the former. But I don't understand why this is so?

I intuitively feel the former approach is better, but can someone please reason out, which is better and why or am I asking too much ?

Upvotes: 2

Views: 807

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1269743

I am fairly certain that in a case expression, the when clause is going to be evaluated separately for each condition. That means that your colleague is correct . . . probably.

The operative part of the documentation is:

Each condition is an expression that returns a boolean result. If the condition's result is true, the value of the CASE expression is the result that follows the condition, and the remainder of the CASE expression is not processed.

It is possible that Postgres does do some sort of optimization of subexpressions by evaluating them once. However, the statement: "the remainder of the CASE expression is not processed" is pretty strong and suggests that each clause will only be processed after the previous ones have evaluated to false (or NULL).

Regardless of whether the optimizer picks figures out that a function can be called only once, the lateral join guarantees that it will be evaluated once, so that seems like the safer solution for an expensive operation.

Upvotes: 1

GMB
GMB

Reputation: 222462

Most likely, Postgres will optimize away the multiple array_agg()s, compute it only once and reuse the results in each comparison. That's quite simple query optimization, that the database should easily spot.

Let me suggest, however, to simplify the query by using conditional aggreagation. Yo don't neeed to aggregate into an array just to check if a given value is there:

select
    s.id,
    s.name
    case 
        when count(*) filter(where status = 'Hard')    > 0 then 'Hard',
        when count(*) filter(where status = 'Soft')    > 0 then 'Soft',
        when count(*) filter(where status = 'Fluid')   > 0 then 'Fluid'
        when count(*) filter(where status = 'Gummy')   > 0 then 'Gummy',
        when count(*) filter(where status = 'Expired') > 0 then 'Expired'
    end status,
    count(*) over() cnt
from sweets s
inner join tasty_goofy_sweets AS tgs on tgs.sweet_id = s.id
group by s.id;

You could also express this without aggregation, using a lateral join and a conditional sort:

select
    s.id,
    s.name,
    tgs.status,
    count(*) over() cnt
from sweets s
cross join lateral (
    select status
    from tasty_goofy_sweets as tgs 
    where tgs.sweet_id = s.id
    order by case status 
        when 'Hard'    then 1
        when 'Soft'    then 2
        when 'Fluid'   then 3
        when 'Gummy'   then 4
        when 'Expired' then 5
    end
    limit 1
) tgs

Upvotes: 2

Related Questions