Dmitri
Dmitri

Reputation: 2457

MySQL multiple joins

I'm making a request that looks like this:

SELECT main_table.*, <complicated calculation> AS computed_variable
FROM main_table
LEFT JOIN table1 ON main_table.id = table1.ref_id
LEFT JOIN table2 ON main_table.id = table2.ref_id
LEFT JOIN table3 ON main_table.id = table3.ref_id
GROUP BY main_table.id
HAVING computed_value > 1000
ORDER BY computed_variable ASC

The important part here is complicated calculation, which SUMs (something like this: SUM(table3.value * table3.duration)) over different table3 columns. I should note that it is absolutely NOT possible to throw it away from this SQL, since it is used in ordering and "having" constraints

The problem I've never come across - two (or more) JOINS (table1, table2) make it so that SUM for table3 is incorrect. According to what I know - all the table3 records are being used multiple times -> COUNT(table1) * COUNT(table2) times, to be precise.

Correct me if I'm wrong.

So - the end value (computed_variable) is much greater than it should be.

Question is - how to fix this?

P.S table1 and table2 joins cannot be removed as well. I need them for other important calculations and constrains.

But they do not use any complicated aggregation functions, so it doesn't really matter if their records have duplicated.

Can anything be done about this problem?

P.S For those of you who wanted to see an example: http://sqlfiddle.com/#!2/57333/4

Upvotes: 0

Views: 214

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1269443

You have a problem of joining along multiple dimensions, which is multiplying the number of rows. Assuming the complicated calculation is just from table3 (as you have described it in the question), you can pre-aggregate the results:

SELECT m.*, <complicated calculation> AS computed_variable
FROM main_table m LEFT JOIN
     table1
     ON m.id = t1.ref_id LEFT JOIN
     table2
     ON m.id = t2.ref_id LEFT JOIN
     (select t3.ref_id, <complicated calculation> AS computed_variable
      table3 t3
      group by t3.ref_id
     ) t3
     ON m.id = t3.ref_id
GROUP BY m.id
HAVING computed_value > 1000
ORDER BY computed_variable ASC;

You may need to include more tables in the subquery, if they are needed for the calculation.

Upvotes: 1

StefanGordon
StefanGordon

Reputation: 1589

Perhaps you can use a sub-query for the join? So, rather than joining the table, you join another select statement which includes the sum performed prior to the join.

e.g. something like:

LEFT JOIN (select *, -calculation- from table3) table3data on main_table.id = table3data.ref_id

Upvotes: 1

Related Questions