Reputation: 371
In my application the users can create campaigns for sending messages. When the campaign tries to send a message, one of the three things can happen:
To keep track of this, I have the following table:
My problem is that when the application has processed a lot of messages (more than 10 million), the query I use for showing campaign statistics for the user slows down by a considerable margin (~ 15 seconds), even when there are only a few (~ 10) campaigns being displayed for the user.
Here is the query I'm using:
select `campaigns`.*, (select count(*) from `processed_messages`
where `campaigns`.`id` = `processed_messages`.`campaign_id` and `status` = 'sent') as `messages_sent`,
(select count(*) from `processed_messages` where `campaigns`.`id` = `processed_messages`.`campaign_id` and `status` = 'failed') as `messages_failed`,
(select count(*) from `processed_messages` where `campaigns`.`id` = `processed_messages`.`campaign_id` and `status` = 'supressed') as `messages_supressed`
from `campaigns` where `user_id` = 1 and `campaigns`.`deleted_at` is null order by `updated_at` desc;
So my question is: how can I make this query run faster? I believe there should be some way of not having to use sub-queries multiple times but I am not very experienced with MySQL syntax yet.
Upvotes: 0
Views: 43
Reputation: 520958
You should write this as a single join, using conditional aggregation:
SELECT
c.*,
COUNT(CASE WHEN pm.status = 'sent' THEN 1 END) AS messages_sent,
COUNT(CASE WHEN pm.status = 'failed' THEN 1 END) AS messages_failed,
COUNT(CASE WHEN pm.status = 'suppressed' THEN 1 END) AS messages_suppressed
FROM campaigns c
LEFT JOIN processed_messages pm
ON c.id = pm.campaign_id
WHERE
c.user_id = 1 AND
c.deleted_at IS NULL
GROUP BY
c.id
ORDER BY
c.updated_at DESC;
It should be noted that at first glance, doing SELECT c.*
appears to be a violation of the GROUP BY
rules which say that only columns which appear in the GROUP BY
clause can be selected. However, assuming that campaigns.id
is the primary key column, then there is nothing wrong with selecting all columns from this table, provided that we aggregate by the primary key.
Edit:
If the above answer does not run on your MySQL server version, with an error message complaining about only full group by, then use this version:
SELECT c1.*, c2.messages_sent, c2.messages_failed, c2.message_suppressed
FROM campaigns c1
INNER JOIN
(
SELECT
c.id
COUNT(CASE WHEN pm.status = 'sent' THEN 1 END) AS messages_sent,
COUNT(CASE WHEN pm.status = 'failed' THEN 1 END) AS messages_failed,
COUNT(CASE WHEN pm.status = 'suppressed' THEN 1 END) AS messages_suppressed
FROM campaigns c
LEFT JOIN processed_messages pm
ON c.id = pm.campaign_id
WHERE
c.user_id = 1 AND
c.deleted_at IS NULL
GROUP BY
c.id
) c2
ON c1.id = c2.id
ORDER BY
c2.updated_at DESC;
Upvotes: 3