Grumpy Civet
Grumpy Civet

Reputation: 381

Performance/time joining multiple tables

I have three data tables that have the same length (~50000), different columns (<500 each), and share a common "id" column.

They look like:

table A

   id    A1    A2   ...      
   1    xxx   xxx   ...
   2    xxx   xxx   ...
  ...   ...   ...   ...  
   n    xxx   xxx   ...

table B

   id    B1    B2   ...  
   1    xxx   xxx   ...
   2    xxx   xxx   ...
  ...   ...   ...   ... 
   n    xxx   xxx   ...

table C

   id    C1    C2   ...  
   1    xxx   xxx   ...
   2    xxx   xxx   ...
  ...   ...   ...   ... 
   n    xxx   xxx   ...

I was trying to join them together using

CREATE TABLE my_table
SELECT *
FROM table_A
LEFT OUTER JOIN table_B
ON table_A.id = table_B.id
LEFT OUTER JOIN table_C
ON table_A.id = table_C.id;

and it's been taking hours.

However, when I do it by two separate steps like

CREATE TABLE my_table_0
SELECT *
FROM table_A
LEFT OUTER JOIN table_B
ON table_A.id = table_B.id;

CREATE TABLE my_table_1
SELECT *
FROM my_table_0
LEFT OUTER JOIN table_C
ON my_table_0.id = table_C.id;

Each "step" only takes less than 5 minutes.

Does anyone know whether this is normal and what's causing it? I wonder if there is a faster way I can join three tables altogether without creating intermediary tables.

Upvotes: 1

Views: 769

Answers (3)

Adrian Ang
Adrian Ang

Reputation: 580

Depending on the SQL software, not all would understand LEFT OUTER JOIN.

Try to use either OUTER JOIN or LEFT JOIN.

Upvotes: 0

blackraven
blackraven

Reputation: 5597

When doing a UNION of 2 tables, you should use FULL OUTER JOIN. Would you try to execute below codes and let me know if it works:

CREATE TABLE my_table
SELECT *
FROM table_A
FULL OUTER JOIN table_B
ON table_A.id = table_B.id
FULL OUTER JOIN table_C
ON table_A.id = table_C.id;

And if you would like to join the 3 tables while maintaining the length of the first table (same number of rows), you should use LEFT JOIN:

CREATE TABLE my_table
SELECT *
FROM table_A
LEFT JOIN table_B
ON table_A.id = table_B.id
LEFT JOIN table_C
ON table_A.id = table_C.id;

Upvotes: 0

Timo Reymann
Timo Reymann

Reputation: 885

Sometimes (My)SQL can be strange.

What maybe already could help in your case is using an inner join, if i understand this correctly all tables share the id column so this should be already a bit faster.

To get a better understanding about what is going on when you execute your query you can use the EXPLAIN keyword, there are some articles using it and understanding the output.

For example this is a good read: https://www.exoscale.com/syslog/explaining-mysql-queries/

Upvotes: 1

Related Questions