Reputation: 11
I am learning SQL for a personal projects and seems that I don't quite get the COUNT function.
I have a "sample" table with this sample data:
What I am attempted to do is print out only those NAME values that have only one COLOR value which is yellow.
Here is the query I wrote but Bob is coming out, which is a mistake.
SELECT COUNT(NAME),NAME
FROM SAMPLE
WHERE (COLOR = 'yellow')
HAVING COUNT(*) = 1
GROUP BY NAME;
May someone tell me what I am doing incorrectly?
Thanks.
Upvotes: 1
Views: 4955
Reputation: 52336
Another method using an analytic function:
SELECT NAME
FROM (
SELECT NAME,
COLOR,
COUNT(*) OVER () ROWS_PER_NAME
FROM SAMPLE )
WHERE COLOR = 'yellow' AND
ROWS_PER_NAME = 1
Also, if very few NAME's had the color yellow I would try:
SELECT NAME,
COLOR
FROM SAMPLE P
WHERE COLOR = 'yellow' AND
NOT EXISTS (
SELECT null
FROM SAMPLE C
WHERE C.NAME = P.NAME
AND COLOR != 'yellow')
Upvotes: 1
Reputation: 881153
It's because your where
clause is limiting the result set before the having
clause is checking.
Hence you are stripping out bob red
so that the only bob
left is the yellow
one. And it will have a count of 1.
This one works for me (albeit in DB2 but since I tend to prefer standard SQL, it should work on any DBMS):
select count(a.name), a.name
from sample a,
(select name from sample where color = 'yellow') b
where a.name = b.name
group by a.name
having count(a.name) = 1;
Yellow returns (no Bob):
--------
NAME
--------
1 Jerry
1 Keri
while red returns (no Tom or Bob):
-------
NAME
-------
1 Paul
The way this works is as follows:
count
aggregate function to combine rows with the same name and give us a count of the colors for each name.I'm assuming here that you won't have a row in the table with a duplicate name and color - in other words, you should have a primary key or other constraint on (name,color). If you did have duplicates, then the cross-join would result in more rows and you would have to use distinct
or group by
in the subquery).
Upvotes: 3
Reputation: 562250
Try this:
SELECT COUNT(NAME),NAME
FROM SAMPLE
GROUP BY NAME
HAVING COUNT(*) = 1 AND MAX(COLOR) = 'yellow';
As @paxdiablo said, you need to leave the rows in the group until after you do the group by, so the count will be accurate. Then you can test for 'yellow'
in the HAVING
clause.
Even though it may seem redundant to use MAX()
like I did in the above example, it's good form because any expression in the HAVING
clause should use group-oriented functions. HAVING
restricts groups whereas WHERE
restricts rows.
Upvotes: 4