Reputation: 1232
This seems like it should be simple but I guess I'm missing something.
I've got a query where I'm selecting a bunch of columns from my DB, and I want to exclude any rows where (for example) the province = 'ontario' or 'quebec'. I've tried it many ways, with different results - none of which are what I want.
select .... from fulldata
WHERE substr(code, 1, 5) <> '10000'
AND (province <> 'ontario' OR province <> 'quebec')
This does work to exclude rows where province is one of those values, but it also excludes all rows where province is null/blank - no idea why.
I tried AND NOT ( stateprovince = 'ontario' OR stateprovince = 'quebec')
with the same result.
I tried AND stateprovince <> ANY(VALUES 'ontario','quebec')
and it doesn't have any impact (all data shows up, nothing is filtered out)
Even breaking it down to something as simple as AND stateprovince <> 'ontario'
does get rid of ontario data, but also get's rid of blanks which isn't the intended result.
In writing this up I've kind of come up with a workaround which is to change the query to this:
...
AND (stateprovince IS NULL OR stateprovince <> 'ontario')
AND (stateprovince IS NULL OR stateprovince <> 'quebec')
Which works, but doesn't seem like the "right" way. Any idea what's up?
I'm using Presto on Athena.
Upvotes: 0
Views: 4621
Reputation: 20730
When province
is NULL
(think: "unknown") then this:
(province <> 'ontario' OR province <> 'quebec')
will evaluate to NULL
, filtering out given row.
Since NOT (NULL)
is also NULL
, there isn't a simple way to avert this.
Thus you need an explicit NULL
check:
(province IS NULL OR province NOT IN('ontario', 'quebec'))
Alternatively you can write this this way:
(coalesce(province, '') NOT IN('ontario', 'quebec'))
Upvotes: 1