NikM
NikM

Reputation: 173

BigQuery - REGEXP_CONTAINS removes NULL value

I'm trying to exclude rows where website contain xyz and 123. However the query is also excluding rows with NULL value. So instead of 4 rows, it returns 3 rows i.e. number 6 is excluded.

Can someone tell what is causing the problem and the correct solution?

I don't want to use NOT IN since my exclusion criteria is a long list, so I think REGEXP_CONTAINS is the only way out. If there's any other string function, then please suggest.

Query Used:

SELECT number,source,detail,website FROM `tablename` where not REGEXP_CONTAINS(website, r'xyz') and not REGEXP_CONTAINS(website, r'123')

enter image description here

Upvotes: 0

Views: 1746

Answers (2)

rmesteves
rmesteves

Reputation: 4075

The BigQuery's regexp_contains function when applied to NULL values returns NULL. In your case, when you apply AND and NOT operations to a NULL value, the value is not changed. Despite that, the WHERE clause sees a NULL value as false.

If you want to keep the false value, you should do:

SELECT number,source,detail,website FROM `tablename` where (not REGEXP_CONTAINS(website, r'xyz') and not REGEXP_CONTAINS(website, r'123')) or website is NULL

I hope it helps

Upvotes: 2

Pentium10
Pentium10

Reputation: 207838

It's expected to exclude NULL.

If you want NULL rows you need to add an OR LOGIC columns is null.

Upvotes: 0

Related Questions