Gayathri
Gayathri

Reputation: 1896

Sort tables in order of dependency - Postgres

Purpose is to allow a insert script to add data to all tables in schema, such that it doesn't create any conflict in constraint. i take tables from information_schema.tables and take constraints from information_schema.table_constraints but not sure how to compare to sort tables in order of foreign key constraints. Please favour.

The below produces duplicate table names:

select a.table_name,b.ordinal_position
from information_schema.tables a left outer join
     information_schema.key_column_usage b
     on a.table_name = b.table_name

Upvotes: 3

Views: 3705

Answers (2)

user330315
user330315

Reputation:

You need a recursive query that traverses the whole dependency tree of foreign key relations.

The following query does this for simple dependencies. It does not handle circular foreign keys

with recursive fk_tree as (
  -- All tables not referencing anything else
  select t.oid as reloid, 
         t.relname as table_name, 
         s.nspname as schema_name,
         null::text COLLATE "default" as referenced_table_name,
         null::text COLLATE "default" as referenced_schema_name,
         1 as level
  from pg_class t
    join pg_namespace s on s.oid = t.relnamespace
  where relkind = 'r'
    and not exists (select *
                    from pg_constraint
                    where contype = 'f'
                      and conrelid = t.oid)
    and s.nspname = 'public' -- limit to one schema 
    
  union all 
  
  select ref.oid, 
         ref.relname, 
         rs.nspname,
         p.table_name,
         p.schema_name,
         p.level + 1
  from pg_class ref
    join pg_namespace rs on rs.oid = ref.relnamespace
    join pg_constraint c on c.contype = 'f' and c.conrelid = ref.oid
    join fk_tree p on p.reloid = c.confrelid
  where ref.oid != p.reloid  -- do not enter to tables referencing theirselves.
), all_tables as (
  -- this picks the highest level for each table
  select schema_name, table_name,
         level, 
         row_number() over (partition by schema_name, table_name order by level desc) as last_table_row
  from fk_tree
)
select schema_name, table_name, level
from all_tables at
where last_table_row = 1
order by level;

For the following table structure:

create table customer (id integer primary key);
create table product (id integer primary key);
create table manufacturer (id integer primary key);
create table manufactured_by (product_id integer references product, manufacturer_id integer references manufacturer);
create table distributor (id integer primary key);
create table orders (id integer primary key, customer_id integer references customer);
create table order_line (id integer primary key, order_id integer references orders);
create table invoice (id integer, order_id integer references orders);
create table delivery (oder_line_id integer references order_line, distributor_id integer references distributor);

This returns the following result:

schema_name  | table_name      | level
-------------+-----------------+------
public       | customer        |     1
public       | distributor     |     1
public       | manufacturer    |     1
public       | product         |     1
public       | manufactured_by |     2
public       | orders          |     2
public       | order_line      |     3
public       | invoice         |     3
public       | delivery        |     4

Which means you need to insert into customer before you can insert into orders. The order of inserts for tables with the same level does not matter.

The intermediate step with the CTE all_tables is necessary to list every table only once. Otherwise the table manufactured_by would be listed twice:

schema_name | table_name      | level
------------+-----------------+------
public      | customer        |     1
public      | product         |     1
public      | manufacturer    |     1
public      | distributor     |     1
public      | manufactured_by |     2
public      | manufactured_by |     2
public      | orders          |     2
public      | delivery        |     2
public      | order_line      |     3
public      | invoice         |     3
public      | delivery        |     4

This query might need more tweaking (especially to prevent cycles in the dependency tree), but it should give you a start.

Upvotes: 14

T.S.
T.S.

Reputation: 19376

The original answer from @user330315 was "almost there" for me. I was getting

ERROR: recursive query "fk_tree" column 4 has collation "default" in non-recursive term but collation "C" overall LINE 6: null::text as referenced_table_name, ^ HINT: Use the COLLATE clause to set the collation of the non-recursive term.

Probably depends on default collation setting. To fix, just add specified collation to null static columns

with recursive fk_tree as (
  -- All tables not referencing anything else
  select t.oid as reloid, 
         t.relname as table_name, 
         s.nspname as schema_name,
         null::text  COLLATE "en_US" as referenced_table_name, -- <<<-----
         null::text  COLLATE "en_US" as referenced_schema_name, -- <<<-----
         1 as level
. . . . . . . . . . 

Upvotes: 3

Related Questions