Cassie
Cassie

Reputation: 3109

Array contains nulls PostgreSQL

I have SQL query in PostgreSQL which filters particular fields in the form of an array for being in bigint range. I would like to add the possibility not to filter out null values. With existing queries, null values for all of the fields are filtered out:

select *
from table_test
where '[0,2147483647]'::int8range @> ALL(ARRAY[fields])

And I would like to do something like this, only here I check against the whole array while I would want to check against each field:

select count(*) from dbm.inventory_source where '[0,2147483647]'::int8range @> ALL(ARRAY[id, exchange_id, min_cpm_micros])
or (array[id, exchange_id, min_cpm_micros]) is null

Also, I would not want to check each field for null instead I would like to check nulls for the whole array of fields. I pass the names of the fields like one string into query (called fields) and it is the reason I do not want to check each field separately. Such implementation was created to have more generic queries for multiple tables.

How can I fix this query?

Upvotes: 1

Views: 255

Answers (2)

Guest
Guest

Reputation: 182

If I understand correctly, presumably you're looking for something like this:

SELECT *
FROM table_test
WHERE '[0,2147483647]'::int8range @> ALL(ARRAY[fields]) IS NOT FALSE

(yeah, sorry, all I did was add three words and capitalize your keywords)


What's this doing? Let's start from the top.

Let's look at what we want from all the conditional stuff. Specifically, it seems we want the condition to return TRUE for every array wherein each value of the array satisfies one of these two conditions:

  1. The value falls within the range [0,2147483647]
  2. The value is NULL

It's useful here to keep in mind the exact meaning of NULL in SQL: it's a value that we don't know. This is why NULL propagates in most operations, and thinking of it that way makes it easier to predict how the database will treat it. In fact, let's replace it with ? for some examples. Why doesn't ? = ? return TRUE? It's because we don't know what either of those values are, so we don't know if they're equal, so the expression evaluates to some unknown value, NULL. What about something like ? + 1? Well, we don't know what the sum is, so it's also NULL. Similarly, ? AND TRUE depends entirely on what the first value is, and we don't know what it is, so we write NULL.

This is where it gets fun: ? AND FALSE will always be false, no matter what our unknown value is, so it evaluates to FALSE instead of NULL. Similarly, ? OR TRUE must evaluate to TRUE.

Now, revisiting our two conditions, we see that your code already checked for condition 1. What about condition 2? Well, think about how ALL works, and what it's really telling you. It's basically evaluating your condition for each entry in your array, then combining all of those with AND to tell you whether or not it's true for all of the entries. This means that your test, specifically the expression

'[0,2147483647]'::int8range @> ALL(ARRAY[fields])

returns TRUE, FALSE, or NULL for each of the entries in the array, then combines those results using AND. Since we know that

  1. TRUE AND TRUE returns TRUE
  2. NULL AND TRUE returns NULL
  3. x AND FALSE returns FALSE for any x

we can safely say that your code will return FALSE if and only if the array contains a value outside of your given range; otherwise, it will return TRUE or NULL. On the other hand, we want to get TRUE regardless of whether your code says TRUE or NULL; in other words, whenever it evaluates to anything other than FALSE. Luckily, there's a predicate for that:

IS NOT NULL

Well! That was pretty complicated to think about, but after finding the solution, it seems almost offensively simple! Way to make me feel stupid.

Check it out here.

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1271121

I would like to add the possibility not to filter out null values.

Based on this, I would expect logic like:

where '[0,2147483647]'::int8range @> ALL(ARRAY[field_1, field_2, field_3]) or
      (field_1 is null and field_2 is null and field_3 is null)

I am unclear if you want to allow all values to be NULL or any of them. The above is for all of them. If you want any, change the ands to ors.

Upvotes: 2

Related Questions