MikeRSpencer
MikeRSpencer

Reputation: 1316

GROUP BY HAVING NOT generates wrong result

I'm running a GROUP BY and HAVING NOT query on an SQLite database but find this doesn't exclude groups that contain the NOT, merely the rows. Example below:

-- Create sample db
CREATE TABLE sample(Year INT, Var INT);
-- Generate data
INSERT INTO sample (Year, Var)
VALUES (2005, 1);
INSERT INTO sample (Year, Var)
VALUES (2005, 3);
INSERT INTO sample (Year, Var)
VALUES (2006, 2);
INSERT INTO sample (Year, Var)
VALUES (2006, 3);
INSERT INTO sample (Year, Var)
VALUES (2007, 1);
INSERT INTO sample (Year, Var)
VALUES (2007, 2);

If I run the following query this returns only the groups 2005 and 2007, because they are the only groups that contain 1:

SELECT * FROM sample
GROUP BY Year, Var
HAVING Var = 1

If I change this query to exclude groups that contain 1, I am returned all groups (even those that contain 1), but not the rows that contain 1.

SELECT * FROM sample
GROUP BY Year, Var
HAVING Var != 1

In essence the HAVING clause now seems to behave as a WHERE clause. What should happen is the above two queries should generate inverse results, i.e. the first selects only Year 2005 and 2007 and the second selects only Year 2006.

My question is how do I select only groups that do NOT contain a given value in the Var column, and latterly is the above a bug that I should file?

So I desire/expect the the second query to only return the group/Year 2006 as it is the only one not to contain a Var of 1.

Upvotes: 2

Views: 150

Answers (2)

ngrashia
ngrashia

Reputation: 9904

Below query would give you the result:

SELECT * FROM sample
WHERE VAR !=1
GROUP BY Year, Var

However, you could as well use,

SELECT * FROM SAMPLE WHERE VAR !=1

since group by has no significance for your table and query specified.

FIDDLE here for both queries.


EDIT:

This would help you. Refer 3rd query in updated FIDDLE

SELECT * FROM SAMPLE
WHERE YEAR NOT IN
(
    SELECT YEAR FROM SAMPLE WHERE VAR=1
)

Upvotes: 1

ah_hau
ah_hau

Reputation: 768

SELECT Year, count(var) FROM sample
WHERE Var != 1
GROUP BY Year

This should be return your expected result. Take a look on the diff query and understand the concept of group by. You do not need to group the Var column like what @CL said.

Addded Fiddle example

Upvotes: 0

Related Questions