user1071182
user1071182

Reputation: 1627

Postgresql view with many common table expressions is slow

This is a huge simplification of my query, but essentially I have a series of common table expressions that build off of each other which I would like to turn into a view. The problem is it's extremely slow when I try to use a view, but very fast when I run the query.

CREATE VIEW user_view AS
WITH cte AS(
  SELECT first,middle,last FROM user
),
cte2 AS(
  SELECT *,first + middle AS first_middle FROM cte
),
cte3 AS(
  SELECT *,first_middle + last AS full_name FROM cte2
)
  SELECT * from cte3;

Fast query

WITH cte AS(
  SELECT first,middle,last FROM user WHERE user_id = 5
),
cte2 AS(
  SELECT *,first + middle AS first_middle FROM cte
),
cte3 AS(
  SELECT *,first_middle + last AS full_name FROM cte2
)
  SELECT * from cte3;

Slow query using the view

SELECT * from user_view WHERE user_id = 5

Upvotes: 1

Views: 1188

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269693

Postgres implements something called an "optimization fence" for CTEs. That means that Postgres materializes each CTE for subsequent processing. One nice effect is that a CTE can be referenced multiple times, but the code is only executed once. The downside is that conveniences such as indexes are "forgotten" after the CTE has been materialized.

For your question, the view is actually immaterial (no pun intended). In this version:

WITH cte AS (
      SELECT first, middle, last FROM user WHERE user_id = 5
     ),
     cte2 AS (
      SELECT *, first || middle AS first_middle FROM cte
     ),
     cte3 AS (
      SELECT *, first_middle || last AS full_name FROM cte2
    )
SELECT * 
FROM cte3;

The first CTE presumably pulls one record out from the table. Presumably, it uses an index on the id and even that operation is very fast. That one record is the only record processed by the remaining CTEs.

In this version:

WITH cte AS (
      SELECT first, middle, last FROM user 
     ),
     cte2 AS (
      SELECT *, first || middle AS first_middle FROM cte
     ),
     cte3 AS (
      SELECT *, first_middle || last AS full_name FROM cte2
    )
SELECT * 
FROM cte3
WHERE user_id = 5;

The CTEs are processing all the data in the user table. At the end, the row meeting the WHERE condition needs to be found. The materialized CTE no longer has an index . . . so the data is searched sequentially.

This behavior does not apply to subqueries, so you can try rewriting your logic using subqueries rather than CTEs.

Postgres optimizes CTEs differently from other databases. For instance, SQL Server never materializes subqueries; the code is always "inserted" into the query and optimized as a whole. In fact, SQL Server forums have the opposite concern -- to implement an option to materialize the CTEs. is different from other databases. Oracle is one database that seems to take both approaches.

Upvotes: 6

Related Questions