mhn
mhn

Reputation: 2750

SQL to find linking column across tables without foreign keys

I am trying to find table links using duplicate column names. Say i have the following tables

T1:

| Prod_ID | Cust_Id | Value |
| P1      | C1      | 1     |
| P2      | C2      | 2     |
| P3      | C3      | 3     |
| P4      | C4      | 4     |
| P5      | C5      | 5     |

T2:

| Prod_ID | Prod_Num |
| P1      | PN1      |
| P2      | PN2      |
| P3      | PN3      |
| P4      | PN4      |
| P5      | PN5      |

I rely on system tables to fetch table information. The data looks like

| tabname | colname  |
| T1      | Prod_ID  |
| T1      | Cust_Id  |
| T1      | Value    |
| T2      | Prod_ID  |
| T2      | Prod_Num |
| T3      | ....     |

If i want to find all tables with columns Prod_ID and Cust_ID, i could do the same using

SELECT tabname, count(*)
FROM syscat.columns
WHERE colname IN ('Prod_ID', 'Cust_Id') 
GROUP BY tabname
HAVING count(*) > 1

Now, when i want to find how two columns across tables are linked, the query is getting complex.

For example: To find how Cust_Id and Prod_Num are linked, the expected output would be something like

| tabname | colname  |
| T1      | Cust_id  |
| T1      | Prod_id  |
| T2      | Prod_id  |
| T2      | Prod_Num |

Suggesting that Prod_Id is contained in both tables and can be used to map Cust_Id and Prod_num. Is there a script for getting something like above?

Upvotes: 1

Views: 44

Answers (1)

Radim Bača
Radim Bača

Reputation: 10701

I would use self-joins for that.

SELECT c1.tabname, c2.colname joinCol, c3.tabname
FROM syscat.columns c1
JOIN syscat.columns c2 ON c1.tabname = c2.tabname
JOIN syscat.columns c3 ON c3.tabname != c2.tabname and c3.colname = c2.colname
JOIN syscat.columns c4 ON c4.tabname = c3.tabname and c3.colname = c2.colname
WHERE c1.colname = 'Cust_Id' and c4.colname = 'Prod_Num'

The output is the following:

tabname joinCol  tabname
---------------------------
T1      Prod_id  T2

which means that table t1 is joined with t2 using prod_id (cust_id and prod_num are on the input, therefore there is no need to have them on the output)

demo - it is SQL Server, however, JOIN will work in DB2 as well ;)

Upvotes: 1

Related Questions