Arturo Sbr
Arturo Sbr

Reputation: 6333

Recursive CTE to climb a hierarchy

I have a table called connections that shows mirrored user connections.

For example:

The resulting table looks like this:

user_id connected_to
1 2
2 1
2 3
3 2
4 6
6 4
5 6
6 5

I want to form sub-networks by joining all users that have other users in common and assign each of them the smallest user_id.

The result I'm looking for is:

user_id subnetwork_id
1 1
2 1
3 1
4 4
5 4
6 4

The logic is as follows:

  1. Users 1, 2 and 3 are all connected through 2. user_id 1 is the smallest of all three, so subnetwork_id should equal 1.
  2. Users 4, 5 and 6 are all connected through 6. user_id 4 is the smallest of all three, so subnetwork_id should be 4.

I wrote a recursive query (will post in future edit) that self-joins the connections table with itself on user_id and connected_to.

My query does not work with the second example (sub-network 4, 5 and 6) because the greatest user_id is the one that connects them all together.

How can I go from table connections to the desired result?

Upvotes: 0

Views: 73

Answers (2)

Alexey
Alexey

Reputation: 2479

Here is a solution using a recursive CTE.

The idea is to build a network for each node in a subquery while aggregating the user_id values of connected nodes in an array (network), which is used to stop recursion since the original connections are mirrored. Then the minimum user_id value from all interconnected nodes is used as a network id.

WITH users AS (
    SELECT DISTINCT user_id FROM connections
)
SELECT 
    u.user_id,
    (
        WITH RECURSIVE cte AS (
            SELECT 
                c.user_id,
                ARRAY[c.user_id] AS network
            FROM connections c
            WHERE c.user_id = u.user_id
            UNION ALL
            SELECT 
                c.user_id, 
                ARRAY[c.user_id] || cte.network AS network
            FROM connections c                
            JOIN cte ON c.connected_to = cte.user_id 
                        AND NOT (ARRAY[c.user_id] && cte.network)
        )
        SELECT MIN(user_id) FROM cte
    ) AS subnetwork_id
FROM users u
ORDER BY user_id, subnetwork_id

Query output

user_id subnetwork_id
1 1
2 1
3 1
4 4
5 4
6 4

You can check a working demo here

Upvotes: 1

h8red
h8red

Reputation: 713

I was using your first version (without mirroring), and assuming the following query should return all the connections in a single tree (assuming no loops):

SELECT LEAST(min(user_id), min(connected_to)) FROM (WITH RECURSIVE connect_rec AS (
    SELECT
        user_id,
        connected_to
    FROM
        connections
    WHERE
        connected_to = 1 OR user_id = 1
    UNION
        SELECT
            cc.user_id,
            cc.connected_to
        FROM
            connections cc
        INNER JOIN connect_rec rec ON (cc.connected_to = rec.user_id or cc.user_id = rec.connected_to or cc.user_id = rec.user_id or cc.connected_to = rec.connected_to)
) SELECT
    *
FROM
    connect_rec) full_connections
    

you can do a for loop for any given range (i'm just using 1..10 here, you can select until max of all values):

do $$
declare f record;
begin
   for counter in 1..10 loop
            for f in (SELECT LEAST(min(user_id), min(connected_to)) as min_tree FROM (WITH RECURSIVE connect_rec AS (
                SELECT user_id, connected_to
                FROM connections
                WHERE connected_to = counter OR user_id = counter
                UNION
                    SELECT cc.user_id, cc.connected_to
                    FROM connections cc
                    INNER JOIN connect_rec rec ON (cc.connected_to = rec.user_id or cc.user_id = rec.connected_to or cc.user_id = rec.user_id or cc.connected_to = rec.connected_to)) 
                    SELECT * FROM connect_rec) full_connections)
            loop 
                raise notice '% - min %', counter, f.min_tree;
            end loop;   
   end loop;
end; $$

the result is:

> notice:  1 - min 1
> notice:  2 - min 1
> notice:  3 - min 1
> notice:  4 - min 4
> notice:  5 - min 4
> notice:  6 - min 4
> notice:  7 - min <NULL>
> notice:  8 - min <NULL>
> notice:  9 - min <NULL>
> notice:  10 - min <NULL>
> OK
> Time: 0,004s

Upvotes: 1

Related Questions