Reputation: 841
I have this SQL in my code and the repetition of the sub-queries has me worried that this isn't as efficiently written as it could be.
My database has a table for features
, responses
, and participants
. Participants give a rating (one of D,P,B,I,R,Q) for each feature.
SELECT f.id, f.name,
(SELECT COUNT(r.id) FROM responses r LEFT JOIN participants p ON r.id_participant = p.id WHERE p.is_ignored IS NULL AND r.is_deleted IS NULL AND r.id_feature = f.id AND p.category LIKE :p_category) AS nr_r,
(SELECT SUM(r.is_D) FROM responses r LEFT JOIN participants p ON r.id_participant = p.id WHERE p.is_ignored IS NULL AND r.is_deleted IS NULL AND r.id_feature = f.id AND p.category LIKE :p_category) AS is_D,
(SELECT SUM(r.is_P) FROM responses r LEFT JOIN participants p ON r.id_participant = p.id WHERE p.is_ignored IS NULL AND r.is_deleted IS NULL AND r.id_feature = f.id AND p.category LIKE :p_category) AS is_P,
(SELECT SUM(r.is_B) FROM responses r LEFT JOIN participants p ON r.id_participant = p.id WHERE p.is_ignored IS NULL AND r.is_deleted IS NULL AND r.id_feature = f.id AND p.category LIKE :p_category) AS is_B,
(SELECT SUM(r.is_I) FROM responses r LEFT JOIN participants p ON r.id_participant = p.id WHERE p.is_ignored IS NULL AND r.is_deleted IS NULL AND r.id_feature = f.id AND p.category LIKE :p_category) AS is_I,
(SELECT SUM(r.is_R) FROM responses r LEFT JOIN participants p ON r.id_participant = p.id WHERE p.is_ignored IS NULL AND r.is_deleted IS NULL AND r.id_feature = f.id AND p.category LIKE :p_category) AS is_R,
(SELECT SUM(r.is_Q) FROM responses r LEFT JOIN participants p ON r.id_participant = p.id WHERE p.is_ignored IS NULL AND r.is_deleted IS NULL AND r.id_feature = f.id AND p.category LIKE :p_category) AS is_Q,
(SELECT (is_D + is_P)/(is_D + is_P + is_B + is_I)) as yay,
(SELECT (is_P + is_B)/(is_D + is_P + is_B + is_I)) as boo
FROM features f
WHERE f.is_deleted IS NULL AND f.id_survey=:id_project
ORDER BY f.id ASC;
The output from that query is a table of totals, looking something like
577 App registration 989 36 21 38 201 42 6 0.1926 0.1993
578 Login PIN 989 279 118 137 394 41 20 0.4278 0.2748
579 Manage all services 989 287 207 127 331 23 14 0.5189 0.3508
580 Rewards 989 344 157 64 386 19 19 0.5268 0.2324
581 Offers 989 226 93 37 542 72 19 0.3552 0.1448
The query currently takes 4.4387 seconds to run for a dataset with 989 participants, 14 features (and thus 13,846 individual ratings). Pretty sure that stinks.
Is there a more efficient way of writing that query?
Is this something that GROUP BY would be useful with?
Upvotes: 0
Views: 69
Reputation: 7171
If I understand your query right you have the same predicates in every sub-query and then you can easily replace all sub-selects with a join and do a group by. The calculations on the aggregates (yay and boo) are calculated at the outer level:
SELECT id, name, nr_r, is_D, ...
, (is_D + is_P)/(is_D + is_P + is_B + is_I) yay
, (is_P + is_B)/(is_D + is_P + is_B + is_I) boo
FROM (
SELECT f.id, f.name
, count(r.id) AS nr_r
, sum(r.is_D) as is_D
, ...
FROM features f
LEFT JOIN responses r
ON r.id_feature = f.id
AND r.is_deleted IS NULL
LEFT JOIN participants p
ON r.id_participant = p.id
AND p.is_ignored IS NULL
AND p.category LIKE :p_category
WHERE f.is_deleted IS NULL
AND f.id_survey=:id_project
GROUP BY f.id, f.name
) AS T
ORDER BY ...
Upvotes: 1