Postgres: If we select a computed column multiple times, will Postgres compute it again and again?

Question

Here is the query that I am trying,

SELECT s.id, s.name AS name,
CASE WHEN (ARRAY_AGG(tgs.status) @> '{Hard} ') THEN 'Hard'
WHEN (ARRAY_AGG(tgs.status) @> '{Soft} ') THEN 'Soft'
WHEN (ARRAY_AGG(tgs.status) @> '{Fluid} ') THEN 'Fluid'
WHEN (ARRAY_AGG(tgs.status) @> '{Gummy} ') THEN 'Gummy'
WHEN (ARRAY_AGG(tgs.status) @> '{Expired} ') THEN 'Expired'
END AS status, 
COUNT(*) OVER()
FROM sweets AS s 
INNER JOIN tasty_goofy_sweets AS tgs on tgs.sweet_id = s.id
GROUP BY s.id;

While implementing this my friend suggested that, instead of computing array_agg every time in the switch case, we could use LEFT JOIN LATERAL and compute it just once. i.e) to implement like below

SELECT s.id, s.name AS name,
CASE WHEN (tgs.status @> '{Hard} ') THEN 'Hard'
WHEN (tgs.arr_status @> '{Soft} ') THEN 'Soft'
WHEN (tgs.arr_status @> '{Fluid} ') THEN 'Fluid'
WHEN (tgs.arr_status @> '{Gummy} ') THEN 'Gummy'
WHEN (tgs.arr_status @> '{Expired} ') THEN 'Expired'
END AS status, 
COUNT(*) OVER()
FROM sweets AS s 
LEFT JOIN LATERAL ( SELECT ARRAY_AGG(tgs.status) AS arr_status FROM tasty_goofy_sweets tgs WHERE  tgs.sweet_id = s.id
) AS tgs ON TRUE
GROUP BY s.id;

But I am not sure if Postgres computes the ARRAY_AGG value every time, how can we decide which approach is better? I tried looking at explain analyse for both the queries, the number of rows involved in the latter query is more than the former. But I don't understand why this is so?

I intuitively feel the former approach is better, but can someone please reason out, which is better and why or am I asking too much ?

Gordon Linoff · Accepted Answer

I am fairly certain that in a case expression, the when clause is going to be evaluated separately for each condition. That means that your colleague is correct . . . probably.

The operative part of the documentation is:

Each condition is an expression that returns a boolean result. If the condition's result is true, the value of the CASE expression is the result that follows the condition, and the remainder of the CASE expression is not processed.

It is possible that Postgres does do some sort of optimization of subexpressions by evaluating them once. However, the statement: "the remainder of the CASE expression is not processed" is pretty strong and suggests that each clause will only be processed after the previous ones have evaluated to false (or NULL).

Regardless of whether the optimizer picks figures out that a function can be called only once, the lateral join guarantees that it will be evaluated once, so that seems like the safer solution for an expensive operation.

Postgres: If we select a computed column multiple times, will Postgres compute it again and again?

Answers (2)

Related Questions