Understanding steps of recursive CTE

Question

I am having trouble understanding how recursive CTE works, in terms of the intermediate steps and the working/temporary tables involved along the way.

Slightly adapting the example from PostgreSQL Documention:

WITH RECURSIVE t(n) AS (
    VALUES (1)
  UNION ALL
    SELECT n+1 FROM t WHERE n < 5
)
SELECT n FROM t;

Running this in Postgres (9.5), I get:

 n 
---
 1
 2
 3
 4
 5
(5 rows)

But why didn't we get more rows? For example

    SELECT n+1 FROM t WHERE n < 5

when n = 2, why doesn't the table t have two rows

---
 1
 2

and based on that generate

---
 2
 3

? If that were the case, the end result should have many duplicate values, such as 2 with the UNION ALL.

The relevant part of the documentation says the following about a "working table" and an "intermediate table", which is although descriptive not clear enough to me:

1.Evaluate the non-recursive term. ... Include all remaining rows in the result of the recursive query, and also place them in a temporary working table.

2.So long as the working table is not empty, repeat these steps:

a. Evaluate the recursive term, substituting the current contents of the working table for the recursive self-reference. ... Include all remaining rows in the result of the recursive query, and also place them in a temporary intermediate table.

b. Replace the contents of the working table with the contents of the intermediate table, then empty the intermediate table.

My questions are:

Can anyone please explain step-wise what is going on with the simple example above?

Also, I am trying to understand this recursive CTE from a programming perspective. Can anyone outline a skeleton for the algorithm in the above CTE to generate the sequence?

Paul A Jungwirth · Accepted Answer

Each time the second half of the CTE runs, it sees only the results of the previous run. So the initial run executes the top half and yields 1. The second run executes the bottom half. It sees t as containing 1, so it returns 2. The third run sees t as containing 2 (not 1 and 2, because it only sees the results of the previous run), so it returns 3.

The fourth run sees 3 and returns 4.

The fifth run sees 4 and returns 5.

The sixth run sees 5, but that is excluded by the WHERE clause, so it returns no rows. Returning no rows is the signal to stop.

So now the complete result of the CTE is 1, 2, 3, 4, 5, which is what everything outside the CTE sees.

Understanding steps of recursive CTE

Answers (2)

Related Questions