zemirco
zemirco

Reputation: 16395

PostgreSQL: How to find all dependencies recursively for a given package?

I'm having the following relation: A package has multiple versions.

CREATE TABLE IF NOT EXISTS package (
  id SERIAL PRIMARY KEY,
  name TEXT NOT NULL UNIQUE
);

CREATE TABLE IF NOT EXISTS version (
  id SERIAL PRIMARY KEY,
  package_id INT NOT NULL REFERENCES package,
  version TEXT NOT NULL,
  
  UNIQUE(package_id, version)
);

A version has multiple dependencies. Those dependencies are again versions of other packages.

CREATE TABLE IF NOT EXISTS dependency (
  version_id INT NOT NULL REFERENCES version,
  dependency_id INT NOT NULL REFERENCES version
);

Take the npm registry for example from the JavaScript/Node.js ecosystem. You can generate a dependency tree using the following command.

npm list --prod

That lists all production dependencies.

├─┬ [email protected]
│ ├── @babel/[email protected] deduped
│ ├─┬ [email protected]
│ │ ├── @babel/[email protected] deduped
│ │ ├── [email protected] deduped
│ │ ├── [email protected]
│ │ ├── [email protected] deduped
│ │ ├── [email protected] deduped
│ │ └── [email protected]
│ ├── [email protected] deduped
│ ├─┬ [email protected]
│ │ ├── [email protected] deduped
│ │ ├── [email protected] deduped
│ │ └── [email protected]
│ ├─┬ [email protected]
│ │ ├── @babel/[email protected] deduped
│ │ ├── [email protected] deduped
│ │ ├── [email protected] deduped
│ │ ├── [email protected] deduped
│ │ ├─┬ [email protected]
│ │ │ ├── @babel/[email protected] deduped
│ │ │ ├── [email protected] deduped
│ │ │ ├── [email protected] deduped
│ │ │ └── [email protected] deduped
│ │ ├─┬ [email protected]
│ │ │ └── [email protected]
│ │ ├── [email protected] deduped
│ │ ├── [email protected]
│ │ ├── [email protected] deduped
│ │ ├── [email protected] deduped
│ │ └── [email protected] deduped
│ ├── [email protected] deduped
│ ├── [email protected]
│ └── [email protected]

As you can see in the example [email protected] depends on [email protected] which depends on [email protected].

The idea is to have all packages and their versions in a database and to find a query that lists all dependencies for a given package. I'd like to find the query that (more or less) gives me the following result.

| name             | version | level |
--------------------------------------
| react-router-dom | 5.2.0   | 1     |
| @babel/runtime   | 7.13.17 | 2     |
| history          | 4.10.1  | 2     |
| @babel/runtime   | 7.13.17 | 3     |
| loose-envify     | 1.4.0   | 3     |
| resolve-pathname | 3.0.0   | 3     |
| ...
| prop-types       | 15.7.2  | 2     |
| ...              | ...     | ...   |

I created a little sql fiddle with some dummy data that can be used as a playground - http://sqlfiddle.com/#!17/633dd.

I think recursive common table expressions are the way to go but I'm a bit lost since I haven't used them yet. I found the typical example where people have managers and you're able generate the organization chart but I wasn't able to transfer this to my problem.

WITH RECURSIVE cte_name AS (
    CTE_query_definition -- non-recursive term
    UNION [ALL]
    CTE_query definion  -- recursive term
) SELECT * FROM cte_name;

Thank you very much for your help!

Upvotes: 1

Views: 683

Answers (1)

Ajax1234
Ajax1234

Reputation: 71451

Below is a cte which produces all the dependences for a package (in this case the package with a version_id of 1):

with recursive cte(v_id, d_id, l) as (
   select d.*, 1 from dependency d where d.version_id = 1
   union all
   select case when d.version_id is null then c.d_id else d.version_id end, 
         case when d.dependency_id is null then null else d.dependency_id end, 
         c.l+1 
   from cte c left join dependency d on c.d_id = d.version_id where c.d_id  is not null
)
select v.package_id, p.name, v.version, c.l 
from cte c join version v on v.id = c.v_id join package p on p.id = v.package_id 
group by v.package_id, p.name, v.version, c.l
order by v.package_id;

See demo here.

Upvotes: 2

Related Questions