Reputation: 1850
Say I have two tables: user
and diary
.
select count(id)
from user
where is_tester is false
Here the primary keys are always called id
. This query above gives me about 270000, which means I have about 270000 users. Now I'd like to know how many diaries each user has. So I went:
select u.id as user_id, u.dm_type, count(d.id) as bg_count
from diary as d
right join (
select id, dm_type
from user
where is_tester is false
) as u
on d.user_id = u.id
where d.glucose_value > 0
group by u.id, u.dm_type
Each user can have only one kind of dm_type
. I am expecting that it would tell me how many diaries each user has, and if one doesn't have any diary, it would give me an NA or 0 since I used right join
. However, the returned table has only about 75000 rows and every users in that table have at least one diary. That's not what I want. Why does it happen and how should I do it correctly?
I've referred to Combining RIGHT JOIN with COUNT, and count one specific field according to the suggestion from the accepted answer.
Edits according to the comments:
user
:
| id | dm_type | is_tester |
|----|---------|-----------|
| 1 | 1 | False |
| 2 | 1 | False |
| 3 | 2 | False |
| 4 | no | False |
| 5 | 2 | True |
diary
:
| id | user_id | glucose_value |
|----|---------|---------------|
| 1 | 1 | -2 |
| 2 | 1 | 80 |
| 3 | 2 | 78 |
| 4 | 2 | 100 |
| 5 | 4 | 83 |
| 6 | 5 | 90 |
Expected result:
| user_id | dm_type | bg_count |
|---------|---------|----------|
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 2 | 0 |
| 4 | no | 1 |
Upvotes: 1
Views: 421
Reputation: 1270843
The problem with your query is the where
clause. It is filtering out unmatched diaries from the outer join.
When you are approaching this type of problem, I strongly recommend left join
over right join
. It means "keep all the rows in the first table even if nothing matches in the second". This is usually easier to follow than "keep all the rows in whatever table is at the end of the from clause but I haven't seen yet".
The next rule is that conditions on the first table go in the where
clause because they really filter rows. Conditions on the second table go in the on
clause. They don't filter rows but are used for matching.
So, you can phrase your query as:
select u.id as user_id, u.dm_type, count(d.id) as bg_count
from user u left join
diary d
on d.user_id = u.id and d.glucose_value > 0
where u.is_tester is false
group by u.id, u.dm_type;
No subquery is needed.
Upvotes: 0
Reputation: 37483
Try with left join, it will give you all user whether it has diary count or not If any user has no diary then it will give you null
select u.id as user_id, u.dm_type, count(d.id) as bg_count from
(select id, dm_type from user where is_tester is false)u
left join diary d on d.user_id = u.id and d.glucose_value > 0
group by u.id, u.dm_type
Upvotes: 1