Dante
Dante

Reputation: 796

Depth-first pre-order traversal of a tree

Look at this closure table:

ancestor descendant path_length
1 1 0
2 2 0
3 3 0
4 4 0
2 4 1
5 5 0
2 5 1
6 6 0
4 6 1
2 6 2
7 7 0
4 7 1
2 7 2
8 8 0
6 8 1
4 8 2
2 8 3

enter image description here

Now I want this in order:

1
2
4
6
8
7
5
3

Notice that all of a node's ancestors may not have a lower node number. Is it possible with a SQL query?

My attempt: Using PostgreSQL documentation section 7.8.2.1. Search Order, I found the following solution:

WITH RECURSIVE search_tree(descendant, path) AS (
SELECT descendant, ARRAY[ROW(ct.ancestor, ct.descendant)]
FROM closure_table ct WHERE descendant = 2
UNION ALL
SELECT
ct.descendant, path || ROW(ct.ancestor, ct.descendant)
FROM closure_table ct, search_tree st
WHERE ct.ancestor = st.descendant AND ct.path_length = 1
)
SELECT * FROM search_tree ORDER BY path;

which you can see here. But I don't know how efficient it is.

Upvotes: 3

Views: 179

Answers (1)

lemon
lemon

Reputation: 15482

Step 1: Finding the roots of your tree

Given your input table, you can do this by selecting

  • all ancestors with "path_length = 0" (to select them only once)
  • that are not found among the descendants with "path_length > 0" (those nodes that are found at least from level = 1 going on).
SELECT ancestor AS root FROM tab WHERE path_length = 0
EXCEPT
SELECT descendant FROM tab WHERE path_length > 0
root
1
2
3

Step 2: Implementing the Depth-First Search for you binary tree.

This can be done by

  • scanning rows whose "ancestor" value only belongs to the roots table (to avoid duplicate "descendant" values) by joining previous table
  • applying the depth-first ordering.

Depth-first ordering will order based on:

  • ancestors first
  • the first son of each child in a recursive (depth-first) way, using a ranking value from the ROW_NUMBER window function,
  • the path_length ascendently when it's going deep towards the leaves, the same path_length descendently when it's going upper towards the roots, using a CASE construct to handle the two situations.
WITH roots AS (
    SELECT ancestor AS root FROM tab WHERE path_length = 0
    EXCEPT
    SELECT descendant FROM tab WHERE path_length > 0
), ranked_nodes AS (
    SELECT *, ROW_NUMBER() OVER(PARTITION BY ancestor, path_length
                                ORDER     BY descendant           ) AS rn
    FROM tab
    INNER JOIN roots
            ON tab.ancestor = roots.root
)
SELECT descendant
FROM ranked_nodes
ORDER BY ancestor, 
         rn, 
         CASE WHEN rn = 1 THEN path_length ELSE -path_length END

Check the demo here.


The upper one is a generalized solution, though if you assume knowing in advance the values of your roots (1, 2 and 3), you can simplify the query as follows:

WITH ranked_nodes AS (
    SELECT *, ROW_NUMBER() OVER(PARTITION BY ancestor, path_length
                                ORDER     BY descendant           ) AS rn
    FROM tab
    WHERE ancestor <= 3
)
SELECT descendant
FROM ranked_nodes
ORDER BY ancestor, 
         rn, 
         CASE WHEN rn = 1 THEN path_length ELSE -path_length END

Upvotes: 1

Related Questions