Reputation: 40932
Alright, I totally hate myself for asking this; but I'm looking for another way to do this query, something that might be faster as well as more elegant (this looks like crap). Tell me what you think please:
SELECT TRUNCATE(SUM(sub.Total),3) AS GrpTotal, sub.ActualDate,
TRUNCATE(SUM(sub.BonusAmt),3) AS GrpBonusAmt, sub.UID,
sub.CUSTID, YEAR(MIN(sub.ActualDate)) AS Year, pusers.username
FROM ( SELECT a.UID, a.ActualDate, 'Global Report' AS Report,
SUM(a.totalpayment) AS Total, a.CUSTID,
((SUM(a.totalpayment)*IFNULL((u.retention_percent/100),1))+IFNULL(u.bonus_amount,0)) AS BonusAmt
FROM `globalreport` a
LEFT JOIN `users` u ON u.uid = a.UID
WHERE true AND a.CUSTID = 1020
GROUP BY a.ActualDate, a.UID
UNION ALL
SELECT a.UID, a.ActualDate, 'Amex Residuals' AS Report,
SUM(a.payment) AS Total, a.CUSTID,
((SUM(a.payment)*IFNULL((u.retention_percent/100),1))+IFNULL(u.bonus_amount,0)) AS BonusAmnt
FROM `amexresiduals` a
LEFT JOIN `users` u ON u.uid = a.UID
WHERE true AND a.CUSTID = 1020
GROUP BY a.ActualDate, a.UID
UNION ALL
SELECT a.UID, a.ActualDate, 'Compliance Fee' AS Report,
SUM(a.profit) AS Total, a.CUSTID,
((SUM(a.profit)*IFNULL((u.retention_percent/100),1))+IFNULL(u.bonus_amount,0)) AS BonusAmnt
FROM `compliancefee` a
LEFT JOIN `users` u ON u.uid = a.UID
WHERE true AND a.CUSTID = 1020
GROUP BY a.ActualDate, a.UID
UNION ALL
SELECT a.UID, a.ActualDate, 'Checks On Demand' AS Report,
SUM(a.myprofit) AS Total, a.CUSTID,
((SUM(a.myprofit)*IFNULL((u.retention_percent/100),1))+IFNULL(u.bonus_amount,0)) AS BonusAmnt
FROM `geticheck` a
LEFT JOIN `users` u ON u.uid = a.UID
WHERE true AND a.CUSTID = 1020
GROUP BY a.ActualDate, a.UID
UNION ALL
SELECT a.UID, a.ActualDate, 'Gift Cards on Demand' AS Report,
SUM(a.payment) AS Total, a.CUSTID,
((SUM(a.payment)*IFNULL((u.retention_percent/100),1))+IFNULL(u.bonus_amount,0)) AS BonusAmnt
FROM `gcod` a
LEFT JOIN `users` u ON u.uid = a.UID
WHERE true AND a.CUSTID = 1020
GROUP BY a.ActualDate, a.UID
UNION ALL
SELECT a.UID, a.ActualDate, 'Global Check' AS Report,
SUM(a.myprofit) AS Total, a.CUSTID,
((SUM(a.myprofit)*IFNULL((u.retention_percent/100),1))+IFNULL(u.bonus_amount,0)) AS BonusAmnt
FROM `globalcheck` a
LEFT JOIN `users` u ON u.uid = a.UID
WHERE true AND a.CUSTID = 1020
GROUP BY a.ActualDate, a.UID
UNION ALL
SELECT a.UID, a.ActualDate, 'Bonus True Up' AS Report,
SUM(a.finalpayment) AS Total, a.CUSTID,
((SUM(a.finalpayment)*IFNULL((u.retention_percent/100),1))+IFNULL(u.bonus_amount,0)) AS BonusAmnt
FROM `bonustrueup` a
LEFT JOIN `users` u ON u.uid = a.UID
WHERE true AND a.CUSTID = 1020
GROUP BY a.ActualDate, a.UID
UNION ALL
SELECT a.UID, a.ActualDate, 'Bonus Take Back - Did Not Activate' AS Report,
SUM(a.amount) AS Total, a.CUSTID,
((SUM(a.amount)*IFNULL((u.retention_percent/100),1))+IFNULL(u.bonus_amount,0)) AS BonusAmnt
FROM `bonusadjnosetup` a
LEFT JOIN `users` u ON u.uid = a.UID
WHERE true AND a.CUSTID = 1020
GROUP BY a.ActualDate, a.UID
UNION ALL
SELECT a.UID, a.ActualDate, 'Bonus Take Back - Closed Less Than 6 Months' AS Report,
SUM(a.amount) AS Total, a.CUSTID,
((SUM(a.amount)*IFNULL((u.retention_percent/100),1))+IFNULL(u.bonus_amount,0)) AS BonusAmnt
FROM `bonusadjclosed6mo` a
LEFT JOIN `users` u ON u.uid = a.UID
WHERE true AND a.CUSTID = 1020
GROUP BY a.ActualDate, a.UID
UNION ALL
SELECT a.UID, a.ActualDate, 'Month End Fee Rejects' AS Report,
SUM(a.amount) AS Total, a.CUSTID, SUM(a.amount) AS BonusAmnt
FROM `merchantloss` a
LEFT JOIN `users` u ON u.uid = a.UID
WHERE true AND a.CUSTID = 1020
GROUP BY a.ActualDate, a.UID
UNION ALL
SELECT a.UID, a.ActualDate, 'Direct ACH Debits and Credits' AS Report,
SUM(a.amount*-1) AS Total, a.CUSTID, SUM(a.amount*-1) AS BonusAmnt
FROM `dirachdebcred` a
LEFT JOIN `users` u ON u.uid = a.UID
WHERE true AND a.CUSTID = 1020
GROUP BY a.ActualDate, a.UID
UNION ALL
SELECT a.UID, a.ActualDate, 'Merchant Adjustments' AS Report,
SUM(a.amount) AS Total, a.CUSTID,
((SUM(a.amount)*IFNULL((u.retention_percent/100),1))+IFNULL(u.bonus_amount,0)) AS BonusAmnt
FROM `merchantadj` a
LEFT JOIN `users` u ON u.uid = a.UID
WHERE true AND a.CUSTID = 1020
GROUP BY a.ActualDate, a.UID
) sub
LEFT JOIN `pending_users` pusers ON pusers.UID = sub.UID
WHERE sub.CUSTID = 1020
AND sub.`UID` NOT IN
( SELECT `UID`
FROM `users`
WHERE `is_admin` AND `company_id` = sub.`CUSTID`)
GROUP BY sub.ActualDate, sub.UID, sub.Report
ORDER BY sub.ActualDate ASC
Clearly, this is a lengthy query. I'm just not sure it has to be. Essentially, I'm gathering and summing a different column in each of the unioned tables and grouping by that amount at the very end so I can get a total sum from all tables.
Upvotes: 0
Views: 88
Reputation: 108450
UPDATE
As if my previous answer wasn't long enough ...
I'd also move the JOIN to the users table from the inner queries to the outer queries, it looks like it's identical in every case.
But, we need that for the calculation of BonusAmnt (which appears to always be based on whatever Total
is; the calculation is the same for all of them except for two. So, I also moved the BonusAmnt calculation to the outer query as well, and just adding a check for which calculation to do, using a CASE expression.
I may have missed something.
But here's how I'd write the query.
I'll show it in two pieces, with the query for the sub
inline view separate from the outer query.
SELECT TRUNCATE(sub.Total,3) AS GrpTotal
, sub.ActualDate
, CASE
WHEN sub.Report IN ('Month End Fee Rejects','Direct ACH Debits and Credits') THEN
TRUNCATE(sub.Total,3)
ELSE
TRUNCATE(sub.Total*IFNULL((u.retention_percent/100),1)+IFNULL(u.bonus_amount,0)),3)
END AS GrpBonusAmt
, sub.UID
, sub.CUSTID
, YEAR(sub.ActualDate) AS Year
, pusers.username
FROM (
-- query to produce sub goes here
) sub
LEFT
JOIN `users` u
ON u.uid = sub.UID
LEFT
JOIN `pending_users` pusers
ON pusers.UID = sub.UID
LEFT
JOIN `users` n
ON n.UID = sub.CUSTID
AND n.is_admin
WHERE n.UID IS NULL
ORDER BY sub.ActualDate, sub.UID, sub.Report
And here is the second part. This is the query that gets stuck into the middle of the part above, as an inline view. This is the query that that produces sub
row source for the outer query:
SELECT 'Global Report' AS Report
, a.UID, a.ActualDate, a.CUSTID
, SUM(a.totalpayment) AS Total
FROM `globalreport` a
WHERE a.CUSTID = 1020
GROUP BY a.ActualDate, a.UID
UNION ALL
SELECT 'Amex Residuals'
, a.UID, a.ActualDate, a.CUSTID
, SUM(a.payment) AS Total
FROM `amexresiduals` a
WHERE a.CUSTID = 1020
GROUP BY a.ActualDate, a.UID
UNION ALL
SELECT 'Compliance Fee'
, a.UID, a.ActualDate, a.CUSTID
, SUM(a.profit) AS Total
FROM `compliancefee` a
WHERE a.CUSTID = 1020
GROUP BY a.ActualDate, a.UID
UNION ALL
SELECT 'Checks On Demand'
, a.UID, a.ActualDate, a.CUSTID
, SUM(a.myprofit) AS Total
FROM `geticheck` a
WHERE a.CUSTID = 1020
GROUP BY a.ActualDate, a.UID
UNION ALL
SELECT 'Gift Cards on Demand'
, a.UID, a.ActualDate, a.CUSTID
, SUM(a.payment) AS Total
FROM `gcod` a
WHERE a.CUSTID = 1020
GROUP BY a.ActualDate, a.UID
UNION ALL
SELECT 'Global Check' AS Report
, a.UID, a.ActualDate, a.CUSTID
, SUM(a.myprofit) AS Total
FROM `globalcheck` a
WHERE a.CUSTID = 1020
GROUP BY a.ActualDate, a.UID
UNION ALL
SELECT 'Bonus True Up'
, a.UID, a.ActualDate, a.CUSTID
, SUM(a.finalpayment) AS Total
FROM `bonustrueup` a
WHERE a.CUSTID = 1020
GROUP BY a.ActualDate, a.UID
UNION ALL
SELECT 'Bonus Take Back - Did Not Activate'
, a.UID, a.ActualDate, a.CUSTID
, SUM(a.amount) AS Total
FROM `bonusadjnosetup` a
WHERE a.CUSTID = 1020
GROUP BY a.ActualDate, a.UID
UNION ALL
SELECT 'Bonus Take Back - Closed Less Than 6 Months'
, a.UID, a.ActualDate, a.CUSTID
, SUM(a.amount) AS Total
FROM `bonusadjclosed6mo` a
WHERE a.CUSTID = 1020
GROUP BY a.ActualDate, a.UID
UNION ALL
SELECT 'Month End Fee Rejects'
, a.UID, a.ActualDate, a.CUSTID
, SUM(a.amount) AS Total
FROM `merchantloss` a
WHERE a.CUSTID = 1020
GROUP BY a.ActualDate, a.UID
UNION ALL
SELECT 'Direct ACH Debits and Credits'
, a.UID, a.ActualDate, a.CUSTID
, SUM(a.amount*-1) AS Total
FROM `dirachdebcred` a
WHERE a.CUSTID = 1020
GROUP BY a.ActualDate, a.UID
UNION ALL
SELECT 'Merchant Adjustments'
, a.UID, a.ActualDate, a.CUSTID
, SUM(a.amount) AS Total
FROM `merchantadj` a
WHERE a.CUSTID = 1020
GROUP BY a.ActualDate, a.UID
Upvotes: 1
Reputation: 108450
Since you're getting results from several different tables, your query actually looks really good. I agree that the length of the SQL text is a bit daunting.
The overall approach looks like it's going to be the most efficient way to get the specified result. I'm assuming that the inner queries against the individual tables are collapsing a lot of rows.
In terms of performance, about the only thing I would change is that NOT IN
predicate, since that's going to cause the subquery to be executed for each row.
You could get an equivalent result using an anti-join pattern, assuming that UID
is a primary key in users
table, or, that it's at least guaranteed to be NOT NULL. (I'm assuming that the subquery doesn't return NULL; if it did, the NOT IN predicate wouldn't return TRUE, and you wouldn't get any rows back at all.)
So this:
WHERE sub.CUSTID = 1020
AND sub.`UID` NOT IN
( SELECT `UID`
FROM `users`
WHERE `is_admin` AND `company_id` = sub.`CUSTID`)
can be replaced with an equivalent but (usually) more efficient:
LEFT
JOIN `users` n
ON n.is_admin
AND n.company_id = sub.`CUSTID`
AND n.`UID` = sub.`UID`
WHERE n.`UID` IS NULL
AND sub.CUSTID = 1020
The anti-join pattern is looking for a matching row, and then excluding any rows that did have a match, so what we're left with is just the rows that didn't match.
Actually, there's no need check that sub.CUSTID = 1020
, that can be omitted. We're already guaranteed it's going to be 1020 by each of the queries in the sub
inline view.
In the SELECT list of the outer query, I think this MIN aggregate adds some confusion:
YEAR(MIN(sub.ActualDate))
It just looks unnecessary, since the query is doing a GROUP BY sub.ActualDate
, and since the YEAR
function is deterministic (i.e. the values returned by YEAR(foo)
are going to be equal for equal values of foo
. I'd replace that with just:
YEAR(sub.ActualDate)
Actually, I don't think it's necessary to perform a GROUP BY
operation at all on the outer query.
The value returned for the Report
column of sub
are distinct to each of the UNION ALL
queries. And each of those queries has already performed a GROUP BY UID, ActualDate
. So, the (UID,ActualDate,Report)
tuple from sub
is already guaranteed to be unique.
The only question there is whether UID
is unique in the pending_users
table. If it's not, then the totals the query is returning are getting doubled, tripled, etc., and just picking the username
value from one matching row. But I don't think that the query is doing that, I strongly suspect that UID
is unique in pending_users
. (There aren't any columns from pending_users
included in the outermost GROUP BY.) Aside from matching multiple rows from pending_users
...
The query has already guaranteed that there aren't any rows to be collapsed by that GROUP BY on the outermost query. It looks like all that GROUP BY is getting you is a sort operation. That could be replaced with an ORDER BY
.
That also means that the SUM
aggregates (in the SELECT list of the outermost query) aren't necessary. For example, this expression:
TRUNCATE(SUM(sub.Total),3)
could be replaced with:
TRUNCATE(sub.Total,3)
and return an equivalent result.
In terms of performance... having suitable indexes available is important. in the EXPLAIN, we'd rather not see Using filesort
for the GROUP BY operations of those inner queries. We'd much rather see Using index
.
For those inner queries, ideally we'd see an indexes like this:
... ON merchantloss (CUSTID, ActualDate, UID, amount)
There's an equality predicate on CUSTID, and there's a GROUP BY operation on ActualDate,UID. Having the columns in this order (almost) guarantees that MySQL will use the index to perform the GROUP BY and avoid a Using filesort
operation. Including the other columns referenced in the query makes it a covering index, meaning the query can be satisfied entirely from the index without any need to visit the pages in the underlying table.
One final note... it might be just a tiny bit more efficient to do that is_admin
exclusion (using the anti-join pattern) on each of the individual inner queries, if that limits the number rows being returned from sub
. But I suspect it won't make much difference, and it might even be slower... I'd leave it on the outermost query, if just for the sake of avoiding repeating that multiple times.
Upvotes: 1