Edward Ko
Edward Ko

Reputation: 55

Left join - how to be more efficient?

I wrote a query that is difficult to read and Im afraid the efficiency might be bad. Can this query be written more efficiently in terms of matching consumerid's from each table to consumerdata.consumerid?

select 

consumerdata.consumerid, 
signupdate, 
city, 
state, 
year(DOB), 
topaffiliate, 
activestatus, 
lastuseddate,

(select sum(achload.transactionamount) from achload where achload.consumerid = consumerdata.consumerid) as Total_ACH,
(select sum(billpay.transactionamount) from billpay where billpay.consumerid = consumerdata.consumerid) as Bill_Pay,
(select sum(recharge.transactionamount) from recharge where recharge.consumerid = consumerdata.consumerid) as Revenue,
(select count(cash.consumerid) from Cash where cash.consumerid = consumerdata.consumerid) as Cash__TXNs,
(select sum(moneytransfer.transactionamount) from moneytransfer where moneytransfer.consumerid = consumerdata.consumerid) as Transfer,
(select sum(moneytransfer.commissionfeeamount) from moneytransfer where moneytransfer.consumerid = tbl_accounts_consumerdata.consumerid) as commission_fee,
(select count(interchangetransactions.transactiondescription) from interchangetransactions where interchangetransactions.transactiondescription = "Withdrawal" and interchangetransactions.consumerid = consumerdata.consumerid) as Withdrawals,
(select count(interchangetransactions.responsecode) from interchangetransactions where interchangetransactions.responsecode in (01) and interchangetransactions.consumerid = consumerdata.consumerid and interchangetransactions.transactiondescription = "Withdrawal") as Code_01,
(select count(interchangetransactions.responsecode) from interchangetransactions where interchangetransactions.responsecode in (02) and interchangetransactions.consumerid = consumerdata.consumerid and interchangetransactions.transactiondescription = "Withdrawal" ) as Code_02,
(select count(interchangetransactions.transactiondescription) from interchangetransactions where interchangetransactions.transactiondescription IN ("2 Cash Advance", "1 Cash Advance") and interchangetransactions.consumerid = consumerdata.consumerid) as Cash_Advance_Count,
(select count(interchangetransactions.transactiondescription) from interchangetransactions where interchangetransactions.transactiondescription IN ("Balance") and interchangetransactions.consumerid = consumerdata.consumerid) as Inquiry_Count,
(select sum(interchangetransactions.transactionamount) from interchangetransactions where interchangetransactions.transactiontypecode = 195 and interchangetransactions.consumerid = consumerdata.consumerid)  as Intnl_TXN_Sum,
(select count(interchangetransactions.transactionamount) from interchangetransactions where interchangetransactions.transactiontypecode = 195 and interchangetransactions.consumerid = consumerdata.consumerid)  as Intnl_TXN_Count


from
consumerdata

left join achload
on achload.consumerid=consumerdata.consumerid

left join  billpay
on billpay.consumerid = consumerdata.consumerid

left join recharge
on recharge.consumerid = consumerdata.consumerid

left join cash
on cash.consumerid = consumerdata.consumerid

left join moneytransfer
on moneytransfer.consumerid = consumerdata.consumerid

left join interchangetransactions
on interchangetransactions.consumerid = consumerdata.consumerid

where consumerdata.signupdate between date(20120101) and curdate()
group by consumerdata.consumerid order by signupdate asc;

Upvotes: 0

Views: 42

Answers (1)

Barmar
Barmar

Reputation: 780974

Since you're not selecting anything from any of the transaction tables, a simple improvement would be to get rid of all the LEFT JOIN clauses. They're not being used for anything, since you're getting the totals from the correlated subqueries. This is causing the database to create a huge cross product between all the tables, then discard it when you do the GROUP BY at the bottom.

Another way to do it is to join with subqueries that calculate the grouped totals of each transaction table, instead of correlated subqueries.

select 
    c.consumerid, signupdate, state, year(DOB), topaffiliate, activestatus, lastuseddate,
    IFNULL(Total_ACH, 0) AS Total_ACH, IFNULL(Bill_Pay, 0) AS Bill_Pay, IFNULL(Revenue, 0) AS Revenue, 
    IFNULL(Cash__TXNs, 0) AS Cash__TXNs, ...
FROM consumerdata AS c
LEFT JOIN (
    SELECT consumerid, SUM(transactionamount) AS Total_ACH
    FROM achload
    GROUP BY consumerid) AS a ON a.consumerid = c.consumerid
LEFT JOIN (
    SELECT consumerid, SUM(transactionamount) AS Bill_Pay
    FROM billpay
    GROUP BY consumerid) AS b ON b.consumerid = c.consumerid
LEFT JOIN (
    SELECT consumerid, SUM(transactionamount) AS Revenue
    FROM recharge
    GROUP BY consumerid) AS r ON r.consumerid = c.consumerid
LEFT JOIN (
    SELECT consumerid, COUNT(*) AS Cash__TXNs
    FROM Cash
    GROUP BY consumerid) AS cash ON cash.consumerid = c.consumerid
...
WHERE c.signupdate between DATE(20120101) and CURDATE()
ORDER BY signupdate ASC

I'm not sure which way will actually be faster -- it would probably depend on how much data there is in the transaction tables for consumers whose signup dates are outside the range you specify. Try both and see which is better.

Upvotes: 1

Related Questions