Reputation: 199
I don't know whether I can do this effectively with subqueries, or even how to build a query for it. I need to extract some information about people who are no longer using our system. Imagine we have 3 users with user ids 1024, 1234, and 5678; users 1024 and 1234 are using theme A, and 5678 is using theme B:
$ SELECT * FROM user; | $ SELECT * FROM user_theme;
+------+------+ | +------+-------+
| id | name | | | user | theme |
+------+------+ | +------+-------+
| 1024 | John | | | 1024 | A |
| 1234 | Jane | | | 1234 | A |
| 5678 | Jeff | | | 5678 | B |
+------+------+ | +------+-------+
The usage tracking table looks like this:
$ SELECT * FROM user_usage;
+----+------+---------------------+------+
| id | user | date | uses |
+----+------+---------------------+------+
| 1 | 1234 | 2014-08-02 00:00:00 | 5 |
| 2 | 1234 | 2014-08-03 00:00:00 | 5 |
| 3 | 1234 | 2014-08-04 00:00:00 | 3 |
| 4 | 1234 | 2014-08-05 00:00:00 | 6 |
| 5 | 1024 | 2014-08-02 00:00:00 | 8 |
| 6 | 1024 | 2014-08-03 00:00:00 | 7 |
| 7 | 1024 | 2014-08-04 00:00:00 | 4 |
| 8 | 1024 | 2014-08-05 00:00:00 | 6 |
| 9 | 1024 | 2014-09-02 00:00:00 | 1 |
| 10 | 1024 | 2014-09-03 00:00:00 | 2 |
| 11 | 1024 | 2014-09-04 00:00:00 | 3 |
| 12 | 1024 | 2014-09-05 00:00:00 | 4 |
| 13 | 5678 | 2014-08-02 00:00:00 | 8 |
| 14 | 5678 | 2014-08-03 00:00:00 | 7 |
| 15 | 5678 | 2014-08-04 00:00:00 | 4 |
| 16 | 5678 | 2014-08-05 00:00:00 | 6 |
| 17 | 5678 | 2014-09-02 00:00:00 | 1 |
| 18 | 5678 | 2014-09-03 00:00:00 | 2 |
| 19 | 5678 | 2014-09-04 00:00:00 | 3 |
| 20 | 5678 | 2014-09-05 00:00:00 | 4 |
+----+------+---------------------+------+
I want a breakdown of how many uses were lost from our system in 2014-09 (i.e. uses from users who have usage data in 2014-08 but none in 2014-09), grouped by theme. So I want to write something like:
SELECT
user_theme.theme,
SUM(user_usage.uses) 'uses lost'
FROM
user_theme
LEFT JOIN user_usage
ON user_theme.user = user_usage.user
WHERE
...
GROUP BY
user_theme.theme
# HAVING ...?
And get a result such as:
+-------+-----------+
| theme | uses lost |
+-------+-----------+
| A | 19 |
| B | 0 |
+-------+-----------+
Here the 19 comes from SUM(uses) for WHERE user = 1234 AND YEAR(date) = 2014 AND MONTH(date) = 8 (5 + 5 + 3 + 6).
I don't know in advance that I care about the SUM(uses) for user = 1234. I only know that user 1234 belongs in the WHERE clause because SUM(uses) for WHERE user = 1234 AND YEAR(date) = 2014 AND MONTH(date) = 9 is 0.
There are actually a lot of users and only a handful of themes (around 20K users and about 10 themes), so ideally I'd like to do the filtering directly in the database rather than in application code. Is there a way to do this effectively in MySQL using raw SQL queries?
Upvotes: 0
Views: 67
Reputation: 24229
Here is a query that compares the current month with the previous one:
-- Take today's date and the same date one month earlier.
set @current_month = now();
set @previous_month = date_sub(@current_month, interval 1 month);
-- Reduce both to a year+month key, e.g. 20149 for September 2014.
set @current_month = concat(year(@current_month), month(@current_month));
set @previous_month = concat(year(@previous_month), month(@previous_month));
-- The left join keeps themes with no lost users, which show up as 0.
select a.`theme`, sum(ifnull(b.uses_lost,0)) as uses_lost
from
`user_theme` as a
left outer join
(
-- Per user: total uses in the previous month, restricted to
-- users who have no rows at all in the current month.
select `user`, sum(uses) as uses_lost
from `user_usage`
where concat(year(`date`), month(`date`)) = @previous_month
and `user` not in (
select `user`
from `user_usage`
where concat(year(`date`), month(`date`)) = @current_month)
group by `user`
) as b
on (a.`user`=b.`user`)
group by a.`theme`;
The main idea is to find all users who used the system during the previous month and have no rows in the current month.
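To check this idea against the sample data from the question, here is a minimal sketch in Python. It uses the standard-library sqlite3 module rather than MySQL, an assumption made only so the example is self-contained and runnable. It is a variant of the approach, not the same query: it uses NOT EXISTS instead of NOT IN, and explicit date ranges instead of the concat(year, month) comparison. The function name uses_lost_by_theme and its three boundary parameters are hypothetical.

```python
import sqlite3

# Sample rows copied from the question.
THEMES = [(1024, "A"), (1234, "A"), (5678, "B")]
AUG = ["2014-08-0%d 00:00:00" % d for d in (2, 3, 4, 5)]
SEP = ["2014-09-0%d 00:00:00" % d for d in (2, 3, 4, 5)]
USAGE = (
    [(1234, d, u) for d, u in zip(AUG, [5, 5, 3, 6])]
    + [(1024, d, u) for d, u in zip(AUG + SEP, [8, 7, 4, 6, 1, 2, 3, 4])]
    + [(5678, d, u) for d, u in zip(AUG + SEP, [8, 7, 4, 6, 1, 2, 3, 4])]
)

# Inner query: per-user total for the previous month, kept only when the
# user has no row in the current month. Outer query: sum per theme, with
# LEFT JOIN + COALESCE so themes that lost nobody still appear as 0.
QUERY = """
SELECT t.theme, COALESCE(SUM(lost.uses), 0) AS uses_lost
FROM user_theme AS t
LEFT JOIN (
    SELECT prev.user, SUM(prev.uses) AS uses
    FROM user_usage AS prev
    WHERE prev.date >= :prev_start AND prev.date < :cur_start
      AND NOT EXISTS (
          SELECT 1 FROM user_usage AS cur
          WHERE cur.user = prev.user
            AND cur.date >= :cur_start AND cur.date < :next_start)
    GROUP BY prev.user
) AS lost ON lost.user = t.user
GROUP BY t.theme
ORDER BY t.theme
"""


def uses_lost_by_theme(prev_start, cur_start, next_start):
    """Return (theme, uses_lost) rows for the month starting at cur_start."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE user_theme (user INTEGER, theme TEXT)")
    conn.execute("CREATE TABLE user_usage (user INTEGER, date TEXT, uses INTEGER)")
    conn.executemany("INSERT INTO user_theme VALUES (?, ?)", THEMES)
    conn.executemany("INSERT INTO user_usage VALUES (?, ?, ?)", USAGE)
    params = {
        "prev_start": prev_start,
        "cur_start": cur_start,
        "next_start": next_start,
    }
    rows = conn.execute(QUERY, params).fetchall()
    conn.close()
    return rows


# Previous month = 2014-08, current month = 2014-09.
print(uses_lost_by_theme("2014-08-01", "2014-09-01", "2014-10-01"))
# -> [('A', 19), ('B', 0)]
```

The inner SELECT should carry over to MySQL largely unchanged. Comparing the date column against range boundaries, instead of wrapping it in year() and month(), generally also lets MySQL use an index on that column.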
Upvotes: 1