Reputation: 5241
I'm working on a project for managing submissions of Key Performance Indicators (KPIs) for onsite projects. Each submission has Performance Measures (PMs) attached to it, these are decided when the submission is enter on site.
The PMs are given a score by the user depending on the level of completion. I am now working on producing a report using all this data. I have a query that works but it is very large and cumbersome. I'm trying to find a way to streamline it.
I have a query that produces a list of KPIs. With this list I then apply some sub queries and simple maths to produce a 'score' for that period.
There are 13 periods. The scores are calculated based on the performance measure socres from completed submissions for the current year. There are two types of submissions 'panel' and 'programme'.
A score needs to be calculated for both submission types for each period. The average of these scores are then taken and the final score is calculated.
The final score is called the pain/gain indicator. we take the calculated score, divide it by 100 and then multiply by 0.3. If the number is less than one then we subtract one from pain/gain, respectively if the number is more than 0 we add 1 to it.
Period 1 is different as we only need to calculate the programme submissions, see below:
PERIOD 1
(
SELECT
ROUND(AVG(sp.progress), 2)
FROM `submissions` AS sub
LEFT JOIN `sub_performancemeasures` sp ON sp.sub_id = sub.id
LEFT JOIN `performance_measures` pm ON pm.id = sp.pm_id
WHERE sub.programme = 'programme'
AND sub.submission_year = '2015/2016' //current year
AND sub.state = '2' //state 2 is complete
AND sub.period = '1'
AND pm.kpi_id = kpi.id
) AS p1
The other periods however need the score based on both submission types:
PERIOD 2 - 12
CASE
WHEN
(
IFNULL((
SELECT
ROUND(AVG(sp.progress), 2)
FROM `submissions` AS sub
LEFT JOIN `sub_performancemeasures` sp ON sp.sub_id = sub.id
LEFT JOIN `performance_measures` pm ON pm.id = sp.pm_id
WHERE sub.period = '2'
AND sub.submission_year = '2015/2016'
AND sub.state = '2'
AND sub.panel = 'panel'
AND pm.kpi_id = kpi.id
), 0) +
IFNULL((
SELECT
ROUND(AVG(sp.progress), 2)
FROM `submissions` AS sub
LEFT JOIN `sub_performancemeasures` sp ON sp.sub_id = sub.id
LEFT JOIN `performance_measures` pm ON pm.id = sp.pm_id
WHERE sub.programme = 'programme'
AND sub.submission_year = '2015/2016'
AND sub.state = '2'
AND sub.period = '2'
AND pm.kpi_id = kpi.id
), 0)/2) / 100 * 0.3 < 0 THEN
1- ROUND((
IFNULL((
SELECT
ROUND(AVG(sp.progress), 2)
FROM `submissions` AS sub
LEFT JOIN `sub_performancemeasures` sp ON sp.sub_id = sub.id
LEFT JOIN `performance_measures` pm ON pm.id = sp.pm_id
WHERE sub.period = '2'
AND sub.submission_year = '2015/2016'
AND sub.state = '2'
AND sub.panel = 'panel'
AND pm.kpi_id = kpi.id
), 0) +
IFNULL((
SELECT
ROUND(AVG(sp.progress), 2)
FROM `submissions` AS sub
LEFT JOIN `sub_performancemeasures` sp ON sp.sub_id = sub.id
LEFT JOIN `performance_measures` pm ON pm.id = sp.pm_id
WHERE sub.programme = 'programme'
AND sub.submission_year = '2015/2016'
AND sub.state = '2'
AND sub.period = '2'
AND pm.kpi_id = kpi.id
), 0)/2) / 100 * 0.3, 2)
WHEN
(
IFNULL((
SELECT
ROUND(AVG(sp.progress), 2)
FROM `submissions` AS sub
LEFT JOIN `sub_performancemeasures` sp ON sp.sub_id = sub.id
LEFT JOIN `performance_measures` pm ON pm.id = sp.pm_id
WHERE sub.period = '2'
AND sub.submission_year = '2015/2016'
AND sub.state = '2'
AND sub.panel = 'panel'
AND pm.kpi_id = kpi.id
), 0) +
IFNULL((
SELECT
ROUND(AVG(sp.progress), 2)
FROM `submissions` AS sub
LEFT JOIN `sub_performancemeasures` sp ON sp.sub_id = sub.id
LEFT JOIN `performance_measures` pm ON pm.id = sp.pm_id
WHERE sub.programme = 'programme'
AND sub.submission_year = '2015/2016'
AND sub.state = '2'
AND sub.period = '2'
AND pm.kpi_id = kpi.id
), 0)/2) / 100 * 0.3 > 0 THEN
1+ ROUND((
IFNULL((
SELECT
ROUND(AVG(sp.progress), 2)
FROM `submissions` AS sub
LEFT JOIN `sub_performancemeasures` sp ON sp.sub_id = sub.id
LEFT JOIN `performance_measures` pm ON pm.id = sp.pm_id
WHERE sub.period = '2'
AND sub.submission_year = '2015/2016'
AND sub.state = '2'
AND sub.panel = 'panel'
AND pm.kpi_id = kpi.id
), 0) +
IFNULL((
SELECT
ROUND(AVG(sp.progress), 2)
FROM `submissions` AS sub
LEFT JOIN `sub_performancemeasures` sp ON sp.sub_id = sub.id
LEFT JOIN `performance_measures` pm ON pm.id = sp.pm_id
WHERE sub.programme = 'programme'
AND sub.submission_year = '2015/2016'
AND sub.state = '2'
AND sub.period = '2'
AND pm.kpi_id = kpi.id
), 0)/2) / 100 * 0.3, 2)
END
END AS p2
This is currently being repeated for the remaining periods. The data I get back is fine but the query is HUGE and ugly. On top of that I want a column at the end of the results that creates an average score for each KPI over all 13 periods. I know I can't reference a column I created by it's alias so currently I'd have to repeat the subqueries AGAIN to get this score.
Is the anyway I can make this query more efficient or loop through periods 2 - 13?
The full query with the results is below:
FULL QUERY UP TO PERIOD 2
SELECT
kra.id AS id,
kra.kra_name AS name,
kra.panel_weighting AS panel_weighting,
kra.programme_weighting AS programme_weighting,
kpi.id AS id,
kpi.kpi_name AS name,
kpi.panel_weighting AS panel_weighting,
kpi.programme_weighting AS programme_weighting,
(
SELECT
ROUND(AVG(sp.progress), 2)
FROM `submissions` AS sub
LEFT JOIN `sub_performancemeasures` sp ON sp.sub_id = sub.id
LEFT JOIN `performance_measures` pm ON pm.id = sp.pm_id
WHERE sub.programme = 'programme'
AND sub.submission_year = '2015/2016'
AND sub.state = '2'
AND sub.period = '1'
AND pm.kpi_id = kpi.id
) AS p1,
CASE
WHEN (
SELECT
ROUND(AVG(sp.progress), 2)
FROM `submissions` AS sub
LEFT JOIN `sub_performancemeasures` sp ON sp.sub_id = sub.id
LEFT JOIN `performance_measures` pm ON pm.id = sp.pm_id
WHERE sub.period = '2'
AND sub.submission_year = '2015/2016'
AND sub.state = '2'
AND sub.panel = 'panel'
AND pm.kpi_id = kpi.id
) IS NULL
AND
(
SELECT
ROUND(AVG(sp.progress), 2)
FROM `submissions` AS sub
LEFT JOIN `sub_performancemeasures` sp ON sp.sub_id = sub.id
LEFT JOIN `performance_measures` pm ON pm.id = sp.pm_id
WHERE sub.programme = 'programme'
AND sub.submission_year = '2015/2016'
AND sub.state = '2'
AND sub.period = '2'
AND pm.kpi_id = kpi.id
) IS NULL
THEN NULL
ELSE
CASE
WHEN
(
IFNULL((
SELECT
ROUND(AVG(sp.progress), 2)
FROM `submissions` AS sub
LEFT JOIN `sub_performancemeasures` sp ON sp.sub_id = sub.id
LEFT JOIN `performance_measures` pm ON pm.id = sp.pm_id
WHERE sub.period = '2'
AND sub.submission_year = '2015/2016'
AND sub.state = '2'
AND sub.panel = 'panel'
AND pm.kpi_id = kpi.id
), 0) +
IFNULL((
SELECT
ROUND(AVG(sp.progress), 2)
FROM `submissions` AS sub
LEFT JOIN `sub_performancemeasures` sp ON sp.sub_id = sub.id
LEFT JOIN `performance_measures` pm ON pm.id = sp.pm_id
WHERE sub.programme = 'programme'
AND sub.submission_year = '2015/2016'
AND sub.state = '2'
AND sub.period = '2'
AND pm.kpi_id = kpi.id
), 0)/2) / 100 * 0.3 < 0 THEN
1- ROUND((
IFNULL((
SELECT
ROUND(AVG(sp.progress), 2)
FROM `submissions` AS sub
LEFT JOIN `sub_performancemeasures` sp ON sp.sub_id = sub.id
LEFT JOIN `performance_measures` pm ON pm.id = sp.pm_id
WHERE sub.period = '2'
AND sub.submission_year = '2015/2016'
AND sub.state = '2'
AND sub.panel = 'panel'
AND pm.kpi_id = kpi.id
), 0) +
IFNULL((
SELECT
ROUND(AVG(sp.progress), 2)
FROM `submissions` AS sub
LEFT JOIN `sub_performancemeasures` sp ON sp.sub_id = sub.id
LEFT JOIN `performance_measures` pm ON pm.id = sp.pm_id
WHERE sub.programme = 'programme'
AND sub.submission_year = '2015/2016'
AND sub.state = '2'
AND sub.period = '2'
AND pm.kpi_id = kpi.id
), 0)/2) / 100 * 0.3, 2)
WHEN
(
IFNULL((
SELECT
ROUND(AVG(sp.progress), 2)
FROM `submissions` AS sub
LEFT JOIN `sub_performancemeasures` sp ON sp.sub_id = sub.id
LEFT JOIN `performance_measures` pm ON pm.id = sp.pm_id
WHERE sub.period = '2'
AND sub.submission_year = '2015/2016'
AND sub.state = '2'
AND sub.panel = 'panel'
AND pm.kpi_id = kpi.id
), 0) +
IFNULL((
SELECT
ROUND(AVG(sp.progress), 2)
FROM `submissions` AS sub
LEFT JOIN `sub_performancemeasures` sp ON sp.sub_id = sub.id
LEFT JOIN `performance_measures` pm ON pm.id = sp.pm_id
WHERE sub.programme = 'programme'
AND sub.submission_year = '2015/2016'
AND sub.state = '2'
AND sub.period = '2'
AND pm.kpi_id = kpi.id
), 0)/2) / 100 * 0.3 > 0 THEN
1+ ROUND((
IFNULL((
SELECT
ROUND(AVG(sp.progress), 2)
FROM `submissions` AS sub
LEFT JOIN `sub_performancemeasures` sp ON sp.sub_id = sub.id
LEFT JOIN `performance_measures` pm ON pm.id = sp.pm_id
WHERE sub.period = '2'
AND sub.submission_year = '2015/2016'
AND sub.state = '2'
AND sub.panel = 'panel'
AND pm.kpi_id = kpi.id
), 0) +
IFNULL((
SELECT
ROUND(AVG(sp.progress), 2)
FROM `submissions` AS sub
LEFT JOIN `sub_performancemeasures` sp ON sp.sub_id = sub.id
LEFT JOIN `performance_measures` pm ON pm.id = sp.pm_id
WHERE sub.programme = 'programme'
AND sub.submission_year = '2015/2016'
AND sub.state = '2'
AND sub.period = '2'
AND pm.kpi_id = kpi.id
), 0)/2) / 100 * 0.3, 2)
END
END AS p2
FROM `kpis` kpi
LEFT JOIN `key_reporting_areas` kra ON kra.id = kpi.kra_id
GROUP BY kpi.id
Upvotes: 0
Views: 53
Reputation: 48139
Since all your underlying sub-queries are based on the corresponding "kpi.id", which is associated with your "pm.kpi_id", that left-join has now pulled everything into an implied INNER join via the WHERE clause. Is that what you expected?
Now, how to possibly streamline it. I would do one big query of each of the period/states into a single "pre-query", then join it ONCE to your "kpi" table based on the ID and pull whatever final values. This is just a SAMPLE, but SHOULD be a good enough guide to what I am referring.
For indexing, I would start by ensuring your "submissions" table has an index on ( submission_year, state, period, programme ) to help optimization.
SELECT
kra.id AS id,
kra.kra_name AS name,
kra.panel_weighting AS panel_weighting,
kra.programme_weighting AS programme_weighting,
kpi.id AS id,
kpi.kpi_name AS name,
kpi.panel_weighting AS panel_weighting,
kpi.programme_weighting AS programme_weighting,
PreAgg.AvgProgrammePer1,
PreAgg.AvgPanelPer1,
PreAgg.AdjFactorPer1,
CASE when PreAgg.AdjFactorPer1 < 0 then 1 - PreAgg.AdjFactorPer1
when PreAgg.AdjFactorPer1 > 0 then 1 + PreAgg.AdjFactorPer1
END as FinalAdjPer1,
PreAgg.AvgProgrammePer2,
PreAgg.AvgPanelPer2,
PreAgg.AdjFactorPer2,
CASE when PreAgg.AdjFactorPer2 < 0 then 1 - PreAgg.AdjFactorPer2
when PreAgg.AdjFactorPer2 > 0 then 1 + PreAgg.AdjFactorPer2
END as FinalAdjPer2,
PreAgg.AvgProgrammePer3,
PreAgg.AvgPanelPer3,
PreAgg.AdjFactorPer3,
CASE when PreAgg.AdjFactorPer3 < 0 then 1 - PreAgg.AdjFactorPer3
when PreAgg.AdjFactorPer3 > 0 then 1 + PreAgg.AdjFactorPer3
END as FinalAdjPer3
from
kpis kpi
LEFT JOIN
( select
KPI_Period.kpi_id,
CASE when KPI_Period.Period = 1
then AvgProgramme END as AvgProgrammePer1,
CASE when KPI_Period.Period = 1
then AvgPanel END as AvgPanelPer1,
CASE when KPI_Period.Period = 1
then (( coalesce( AvgProgramme, 0 )
+ coalesce( AvgPanel, 0 )) / 2 ) / 100 * .3 as AdjFactorPer1,
CASE when KPI_Period.Period = 2
then AvgProgramme END as AvgProgrammePer2,
CASE when KPI_Period.Period = 2
then AvgPanel END as AvgPanelPer2,
CASE when KPI_Period.Period = 2
then (( coalesce( AvgProgramme, 0 )
+ coalesce( AvgPanel, 0 )) / 2 ) / 100 * .3 as AdjFactorPer2,
CASE when KPI_Period.Period = 3
then AvgProgramme END as AvgProgrammePer3,
CASE when KPI_Period.Period = 3
then AvgPanel END as AvgPanelPer3,
CASE when KPI_Period.Period = 3
then (( coalesce( AvgProgramme, 0 )
+ coalesce( AvgPanel, 0 )) / 2 ) / 100 * .3 as AdjFactorPer3
from
( SELECT
pm.kpi_id,
sub.period,
AVG( CASE when sub.programme = 'programme' then sp.progress end ) as AvgProgramme,
AVG( CASE when sub.programme = 'panel' then sp.progress end ) as AvgPanel,
FROM
submissions AS sub
LEFT JOIN sub_performancemeasures sp
ON sub.id = sp.sub_id
LEFT JOIN performance_measures pm
ON sp.pm_id = pm.id
WHERE
sub.submission_year = '2015/2016'
AND sub.state = '2'
AND sub.programme IN ( 'programme', 'panel' )
GROUP BY
pm.kpi_id,
sub.period ) as KPI_Period
group by
KPI_Period.kpi_id ) PreAgg
ON kpi.id = PreAgg.kpi_id
Now, to explain what I am attempting here. The inner-most query is going directly to your underlying tables for submission progress activity based on the year 2015/2016, the state = 2 and the programme as either programme or panel. I am grouping the averages based on each PERIOD. Yes, you can adjust for the ROUNDING of the avg, but I did the average. So, the end of this phase, I would have something like...
KPI_ID Period AvgProgramme AvgPanel
1 1 prog1.blah panel1.blah
1 2 prog2.blah panel2.blah
1 3 prog3.blah panel3.blah
As the rest of the year progresses, so too will the records for period 4-12(13 if four-week periods vs monthly periods). This result is the "KPI_Period" alias subselect select.
Now, that this has been simplified for all possible periods as like a "year-to-date" possibility, I am building a cross-tabbing each period into columns. Now, this part will significantly simplify your IFNULL() conditions. COALESCE() will take a value, and if null, use the second parameter as the value to apply whatever. So, the next level up brings us the
CASE when KPI_Period.Period = 1
then AvgProgramme END as AvgProgrammePer1,
CASE when KPI_Period.Period = 1
then AvgPanel END as AvgPanelPer1,
CASE when KPI_Period.Period = 1
then (( coalesce( AvgProgramme, 0 )
+ coalesce( AvgPanel, 0 )) / 2 ) / 100 * .3 as AdjFactorPer1,
Notice the case/when only cares about period 1 (or 2, or 3 respectively). So, for Period = 1, I would be getting the AvgProgramme value of "prog1.blah" and "panel1.blah" which could be null, or an actual value. So they are what they are, taken as is, but assigned a final name of "AvgProgrammePer1" and "AvgPanelPer1" respectively.
Now, your crazy computation. I am also pre-rolling up this computation ONCE via applying the COALESCE.
( ( coalesce( AvgProgramme, 0 ) + coalesce( AvgPanel, 0 ))
/ 2 ) / 100 * .3 as AdjFactorPer1,
So, if the AvgProgramme OR AvgPanel is a null value, it will be converted to 0 for the computation. So you will never have a divide by 0 failure, but 0/2 is zero, then 0 / 100 = 0... * .3 = 0. So, if both values are null, your final adjustment factor is zero and thus no +/- 1 applied in the final set. This is applied for each "period" giving a result something like...
(abbreviated columns and sample data)
KP AvgProg1 AvgPan1 Adj1 AvgProg2 AvgPan2 Adj2 AvgProg3 AvgPan3 Adj3
1 prog1blah pan1blah avg1 prog2blah pan2blah avg2 prog3blah pan3blah avg3
So now, for EACH KPI_ID, you have 1 row with all the programme, panel and adjustment computation factors applied, no Cartesian results, so no duplicate accounting should occur. This is the "PreAgg" (pre-aggregate) final result that will be joined to your KPI table.
Now, the top-most level of all the fields. You can (as I sampled), grab all the columns, but here is the simplified part of the adjustment factor. Since it is already pre-computed at the Pre-Agg level, we can now just apply the factoring
PreAgg.AvgProgrammePer1,
PreAgg.AvgPanelPer1,
PreAgg.AdjFactorPer1,
CASE when PreAgg.AdjFactorPer1 < 0 then 1 - PreAgg.AdjFactorPer1
when PreAgg.AdjFactorPer1 > 0 then 1 + PreAgg.AdjFactorPer1
END as FinalAdjPer1,
You can always remove the redundant columns you don't care to have in the final result, but at least CAN have them to confirm computations along the way. And this query applies for 3 full periods... just needs the additional case/when blockes for periods 4-12, no additional subquerying constructs to bang your head with.
Upvotes: 1