Chrisvdberge
Chrisvdberge

Reputation: 1956

MySQL Combining data from 2 tables into 3rd table

I currently have 2 tables:

CrawlData
id (autoincrement), Source, Destination, and some more columns

Nodes
id (autoincrement), URL

The Nodes table contains the distinct Source values from CrawlData

Now I would like to have a table that is a kind of look up table that contains the ID's from Nodes instead of the texts in Source and Destination from CrawlData

I can get all the ID's with a Select query using Join on the URL=Source and URL = Destination, but don't know how to combine these and then also to get them in a new table Edges with 2 columns:

Upvotes: 1

Views: 1966

Answers (2)

John Woo
John Woo

Reputation: 263703

You can INSERT the records returned by SELECT statement using INSERT INTO...SELECT statement.

INSERT INTO Edges(SourceNode, DestinationNode)
SELECT  b.ID SourceNode,
        c.ID DestinationNode
FROM    CrawlData a
        INNER JOIN Nodes b
            ON a.Source = b.URL
        INNER JOIN Nodes c
            ON a.Destination = c.URL

To further gain more knowledge about joins, kindly visit the link below:

For faster execution, execute the following statements to add INDEX on the columns to avoid FULL TABLE SCAN which could be slow if doing on large RDBMS.

ALTER TABLE Nodes ADD INDEX (URL);

if happens that all values of Source and Destination column are present on Nodes.URL, declare these columns as foreign keys,

ALTER TABLE CrawlData 
      ADD CONSTRAINT cd_fk1 FOREIGN KEY (Source) REFERENCES Nodes(URL)
ALTER TABLE CrawlData 
      ADD CONSTRAINT cd_fk2 FOREIGN KEY (Destination) REFERENCES Nodes(URL)

otherwise, add normal index on them

ALTER TABLE CrawlData ADD INDEX (Source);
ALTER TABLE CrawlData ADD INDEX (Destination);

Upvotes: 2

Darius X.
Darius X.

Reputation: 2937

You can join twice to the Nodes table. Once, use the Source to join to URLK. The next time use the Destination.

Conceptually, it is like using two copies of the Nodes table, each with a different name (say "S" and "D"). You get:

select S.ID As SOURCE_ID, D.ID As DEST_ID
from CrawlData
join Nodes S on Source = S.URL
join Nodes D on Destination = D.URL

Upvotes: 1

Related Questions