user2349115
user2349115

Reputation: 1286

MySql Join with Sum returns wrong output

This is my sql query:

select 
    sum(table1.quantity) as t1q, 
    sum(table1_2.quantity) as t2q, 
    sum(table3.quantity) as t3q, 
    table1.pid as pid
from table1 
    inner join table3 on table1.pid=table3.pid
    inner join table1 table1_2 on table1.pid=table1_2.pid 
where 
    table1.to_id=10 and 
    table3.some_id=10 and 
    table1_2.from_id=10 
group by pid;

Sample Data:

Table1:
quantity, to_id, from_id, pid

6, 10, 999999, 345
4, 888999, 10, 345
3, 888999, 10, 345

If you observer above sql:

There are 2 tables:

Table1 (same table used twice as table1 and table1_2)
Table3

I want to fetch from table1.to_id=10 for calculating t1q and table1(same table).from_id=10 for calculating t2q.

I am getting correct output in some cases but in some cases, t1q is giving value as 12 when it should be 6. In that scenario, its because after joins, there are 2 records of table1_2 and only one record for for table1. So, it is counting table1.quantity twice even though its has only one record after filtering.

Can you please provide correct sql query.

Upvotes: 0

Views: 657

Answers (3)

MatBailie
MatBailie

Reputation: 86715

I'm presupposing that you are correct, that the ONLY problem is when the join on table1_2 causes duplication.

In which case, aggregate that table first, before joining on it.

select 
    sum(table1.quantity) as t1q, 
    sum(table1_2.quantity) as t2q, 
    sum(table3.quantity) as t3q, 
    table1.pid as pid
from
    table1 
inner join
    table3
        on table1.pid=table3.pid
inner join
(
    SELECT
        pid,
        SUM(quantity) AS quantity
    FROM
        table1
    WHERE
        from_id = 10
    GROUP BY
        pid
) 
    table1_2
        on table1.pid=table1_2.pid 
where 
    table1.to_id=10 and
    table3.some_id=10
group by
    pid

If your assumption is incomplete, you may need to do that to each table...

select 
    table1.quantity    as t1q, 
    table1_2.quantity  as t2q, 
    table3.quantity    as t3q, 
    table1.pid         as pid
from
 (
    SELECT
        pid,
        SUM(quantity) AS quantity
    FROM
        table1
    WHERE
        to_id = 10
    GROUP BY
        pid
)
    table1
inner join
(
    SELECT
        pid,
        SUM(quantity) AS quantity
    FROM
        table3
    WHERE
        some_id = 10
    GROUP BY
        pid
)
    table3
        on table1.pid=table3.pid
inner join
(
    SELECT
        pid,
        SUM(quantity) AS quantity
    FROM
        table1
    WHERE
        from_id = 10
    GROUP BY
        pid
) 
    table1_2
        on table1.pid=table1_2.pid

Without and example dataset to test against, there's a great deal of supposition there.

But, as a rule, aggregate before joining, if you're joining on a column that's not unique in that table.

EDIT: Response to comment

SELECT
    COALESCE(table1.t1q, 0)           AS t1q,
    COALESCE(table1.t2q, 0)           AS t2q,
    COALESCE(table3.t3q, 0)           AS t3q,
    COALESCE(table1.pid, table3.pid)  AS pid
FROM
(
    SELECT
        pid,
        SUM(CASE WHEN   to_id = 10 THEN quantity ELSE 0 END) AS t1q,
        SUM(CASE WHEN from_id = 10 THEN quantity ELSE 0 END) AS t2q
    FROM
        table1
    WHERE
            to_id   = 10
        OR  from_id = 10
    GROUP BY
        pid
)
    table1
FULL OUTER JOIN
(
    SELECT
        pid,
        SUM(quantity) AS quantity
    FROM
        table3
    WHERE
        some_id = 10
    GROUP BY
        pid
)
    table3
        ON table1.pid = table3.pid

Or...

SELECT
    SUM(t1q)   AS t1q,
    SUM(t2q)   AS t2q,
    SUM(t3q)   AS t3q,
    pid
FROM
(
    SELECT pid, quantity AS t1q, 0 AS t2q, 0 AS t3q FROM table1 WHERE to_id = 10
    UNION ALL
    SELECT pid, 0              , quantity, 0        FROM table1 WHERE from_id = 10
    UNION ALL
    SELECT pid, 0              , 0       , quantity FROM table3 WHERE some_id = 10
)
    combined
GROUP BY
    pid

Upvotes: 1

Philip Wrage
Philip Wrage

Reputation: 1559

I don't have enough information to determine whether your schema is 'wrong', and that wasn't your original question.

However, with the information you've provided, this is how I would approach solving the original question:

SELECT t1.pid, t1q, t2q, t3q

FROM (SELECT pid, to_id, SUM(quantity) AS t1q FROM table1 GROUP BY pid, to_id) AS t1

INNER JOIN (SELECT pid, from_id, SUM(quantity) AS t2q FROM table1 GROUP BY pid, from_id) AS t2
ON t1.pid = t2.pid AND t1.to_id = t2.from_id

INNER JOIN (SELECT pid, some_id, SUM(quantity) AS t3q FROM table3 GROUP BY pid, some_id) AS t3
ON t1.pid = t3.pid AND t1.to_id = t3.some_id

WHERE t1.to_id = 10

You can then udpate t1.to_id = 10 to be any value you want in future queries, and it may even lend itself to becoming a stored procedure where you pass in the to_id as an argument.

Upvotes: 0

Geoff Griswald
Geoff Griswald

Reputation: 1093

I think the issue comes from trying to use the same table twice in the same calculation, where you have no unique keys.

This is possible, but the SQL is ugly. The way I would approach it would be like this, assuming your "table3", (which you don't specify) is laid out similarly to your "table1":

SELECT a.pid,
       b.t1q,
       c.t2q,
       d.t3q
 FROM (SELECT DISTINCT(pid) FROM table1 AS pid) a
 JOIN (SELECT pid, SUM(quantity) AS t1q FROM table1 WHERE to_id = 10 GROUP BY pid) b ON a.pid = b.pid
 JOIN (SELECT pid, SUM(quantity) AS t2q FROM table1 WHERE from_id = 10 GROUP BY pid) c ON a.pid = b.pid
 JOIN (SELECT pid, SUM(quantity) AS t3q FROM table3 WHERE some_id = 10 GROUP BY pid) d ON a.pid = d.pid

Upvotes: 1

Related Questions