MySQL filter by GROUP BY result

Question

I don't know whether this can be done efficiently with subqueries, or even how to build a query for it. I need to extract some information about people who are no longer using our system. Imagine we have 3 users with user IDs 1024, 1234, and 5678; users 1024 and 1234 are using theme A, and user 5678 is using theme B:

$ SELECT * FROM user;  | $ SELECT * FROM user_theme;
+------+------+        | +------+-------+
|   id | name |        | | user | theme |
+------+------+        | +------+-------+
| 1024 | John |        | | 1024 |     A |
| 1234 | Jane |        | | 1234 |     A |
| 5678 | Jeff |        | | 5678 |     B |
+------+------+        | +------+-------+

The usage tracking table looks like this:

$ SELECT * FROM user_usage;
+----+------+---------------------+------+
| id | user | date                | uses |
+----+------+---------------------+------+
|  1 | 1234 | 2014-08-02 00:00:00 |    5 |
|  2 | 1234 | 2014-08-03 00:00:00 |    5 |
|  3 | 1234 | 2014-08-04 00:00:00 |    3 |
|  4 | 1234 | 2014-08-05 00:00:00 |    6 |
|  5 | 1024 | 2014-08-02 00:00:00 |    8 |
|  6 | 1024 | 2014-08-03 00:00:00 |    7 |
|  7 | 1024 | 2014-08-04 00:00:00 |    4 |
|  8 | 1024 | 2014-08-05 00:00:00 |    6 |
|  9 | 1024 | 2014-09-02 00:00:00 |    1 |
| 10 | 1024 | 2014-09-03 00:00:00 |    2 |
| 11 | 1024 | 2014-09-04 00:00:00 |    3 |
| 12 | 1024 | 2014-09-05 00:00:00 |    4 |
| 13 | 5678 | 2014-08-02 00:00:00 |    8 |
| 14 | 5678 | 2014-08-03 00:00:00 |    7 |
| 15 | 5678 | 2014-08-04 00:00:00 |    4 |
| 16 | 5678 | 2014-08-05 00:00:00 |    6 |
| 17 | 5678 | 2014-09-02 00:00:00 |    1 |
| 18 | 5678 | 2014-09-03 00:00:00 |    2 |
| 19 | 5678 | 2014-09-04 00:00:00 |    3 |
| 20 | 5678 | 2014-09-05 00:00:00 |    4 |
+----+------+---------------------+------+

I want a breakdown of how much usage dropped out of our system in 2014-09 (that is, from users who have usage data in 2014-08 but none in 2014-09), grouped by theme. So I want to write something like:

SELECT
    user_theme.theme,
    SUM(user_usage.uses) 'uses lost'
FROM
    user_theme
    LEFT JOIN user_usage 
        ON user_theme.user = user_usage.user
WHERE
    ...
GROUP BY
    user_theme.theme
# HAVING ...?

And get a result such as:

+-------+-----------+
| theme | uses lost |
+-------+-----------+
|     A |        19 |
|     B |         0 |
+-------+-----------+

Here the 19 comes from SUM(uses) for WHERE user = 1234 AND YEAR(date) = 2014 AND MONTH(date) = 8. I don't know in advance that I care about the SUM(uses) for user = 1234; I only learn that user 1234 belongs in the WHERE clause because SUM(uses) for WHERE user = 1234 AND YEAR(date) = 2014 AND MONTH(date) = 9 is 0.

There are actually a lot of users and only a handful of themes (around 20K users and about 10 themes), so ideally I'd like to do the filtering directly in the database rather than in code. Is there a way to do this efficiently in MySQL using raw SQL queries?

Iłya Bursov · Accepted Answer

Here is a query that compares the current month with the previous one:

set @current_month = now();
set @previous_month = date_sub(@current_month, interval 1 month);

set @current_month = concat(year(@current_month), month(@current_month));
set @previous_month = concat(year(@previous_month), month(@previous_month));

select a.`theme`, sum(ifnull(b.uses_lost,0)) as uses_lost
from
`user_theme` as a
left outer join
(
select `user`, sum(uses) as uses_lost
from `user_usage`
where concat(year(`date`), month(`date`)) = @previous_month
  and `user` not in (
    select `user`
    from `user_usage`
    where concat(year(`date`), month(`date`)) = @current_month)
group by `user`
) as b
on (a.`user`=b.`user`)
group by a.`theme`;


The main idea is to find all users who used the system during the previous month and have no rows in the current month.
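
For the sample data in the question, the same idea can be sketched with the two months written out as literal date ranges instead of being derived from now(). This is only a sketch under assumptions: the literal month boundaries and the aliases t, p, c, and lost are illustrative and not part of the original query, and NOT EXISTS is swapped in for the NOT IN used above.

```sql
-- Sketch: "used in 2014-08, no rows in 2014-09", summed per theme.
-- Aliases (t, p, c, lost) and the hard-coded dates are illustrative assumptions.
SELECT
    t.`theme`,
    COALESCE(SUM(lost.uses_lost), 0) AS uses_lost
FROM `user_theme` AS t
LEFT JOIN (
    -- Per-user usage in the previous month (2014-08) ...
    SELECT p.`user`, SUM(p.`uses`) AS uses_lost
    FROM `user_usage` AS p
    WHERE p.`date` >= '2014-08-01'
      AND p.`date` <  '2014-09-01'
      -- ... kept only for users with no row in the current month (2014-09).
      AND NOT EXISTS (
          SELECT 1
          FROM `user_usage` AS c
          WHERE c.`user` = p.`user`
            AND c.`date` >= '2014-09-01'
            AND c.`date` <  '2014-10-01'
      )
    GROUP BY p.`user`
) AS lost
    ON lost.`user` = t.`user`
GROUP BY t.`theme`;
```

On the sample rows this should give 19 for theme A (user 1234's usage in 2014-08) and 0 for theme B. Comparing `date` against plain ranges leaves the column unwrapped, so MySQL can use an index on it if one exists, which it generally cannot do when the column is wrapped in concat(year(...), month(...)).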
