artejera
artejera

Reputation: 1496

postgres: How to protect conditional expressions from null values

After years of using Postgresql, I still don't know if there is an established best-practice on how to protect conditional expressions from null values of variables, given that SQL query planners have full authority to apply or ignore the most frequently used idiom to protect from null values: "var is null or var=0".

Allegedly, using the 'case when ... end' grammar solves any ambiguity, but also reduces maintainability, since it obscures with lots of words a simple procedure.

Thanks, in advance.

Upvotes: 1

Views: 2257

Answers (2)

artejera
artejera

Reputation: 1496

It seems I just asked a question which has forever been present. So, per de problem of NULL propagation in SQL logical expressions, with the added danger of the sql optimizer not honoring short-circuit constructs, and of evolving SQL standards, let me share what I've found so far:

  1. Read wikipedia's article on SQL NULL PROPAGATION
  2. Use coalesce() around any column name with possible null values, involved in any calculation within a sql statement (thanks Igor).
  3. Also use 'is [not] distinct from' instead of '=' or '<>'

Upvotes: 0

joanolo
joanolo

Reputation: 6328

I think you have a missconception arising from comparing SQL to Java (or C, C++, or any language dealing with references or pointers).

You don't need to protect conditional expressions from NULL values when working with SQL.

In SQL, you do not have (hidden) pointers (or references) to objects that should be tested against NULL or otherwise they cannot be dereferenced. In SQL, every expression produces a certain value of a certain type. This value can be NULL (also called UNKNOWN).

If your var is NULL, then var = 0 will evaluate to NULL (unknown = 0 gives back unknown). Then var IS NULL (unknown is unknown) will evaluate to TRUE. And, according to three-value logic, TRUE or UNKNOWN evaluates to TRUE. No matter which is the order of evaluation, the result is always the same.

You can check it just by evaluating:

SELECT
    /* var */ NULL = 0 as null_equals_zero,
    /* var */ NULL IS NULL as null_is_null,
    TRUE or NULL AS true_or_null,
    (NULL = 0) OR (NULL IS NULL) AS your_case_when_var_is_null,
    (NULL IS NULL) OR (NULL = 0) AS the_same_reordered
    ;

Returns

null_equals_zero | null_is_null | true_or_null | your_case_when_var_is_null | the_same_reordered
:--------------- | :----------- | :----------- | :------------------------- | :-----------------
null             | t            | t            | t                          | t                 

dbfiddle here

Given var = 0, NULL and 1 (<> 0); you'll get:

WITH vals(var) AS
(
    VALUES
    (0),
    (NULL),
    (1)
)
SELECT
    var,
    var = 0 OR var IS NULL  AS var_equals_zero_or_var_is_null,
    var IS NULL OR var = 0  AS var_is_null_or_var_equals_zero,
    CASE WHEN var IS NULL then true
         WHEN var = 0 then true
         ELSE false
    END                     AS the_same_with_protection
FROM
    vals ;
 var | var_equals_zero_or_var_is_null | var_is_null_or_var_equals_zero | the_same_with_protection
---: | :----------------------------- | :----------------------------- | :-----------------------
   0 | t                              | t                              | t                       
null | t                              | t                              | t                       
   1 | f                              | f                              | f                       

dbfiddle here

These are the basic truth tables for the different operators (NOT, AND, OR, IS NULL, XOR, IMPLIES) using three-valued logic, and checked with SQL:

WITH three_values(x) AS
(
VALUES 
   (NULL), (FALSE), (TRUE)
)
SELECT
   a, b, 
   a = NULL AS a_equals_null,  -- This is alwaus NULL
   a IS NULL AS a_is_null,     -- This is NEVER NULL
   a OR  b AS a_or_b,          -- This is UNKNOWN if both are
   a AND b AS a_and_b,         -- This is UNKNOWN if any is
   NOT a   AS not_a,           -- This is UNKNOWN if a is
   (a OR b) AND NOT (a AND b) AS a_xor_b,   -- Unknown when any is unknown
   /* (a AND NOT b) OR (NOT a AND b) a_xor_b_v2, */
   NOT a OR b  AS a_implies_b  -- Kleener and Priests logic
FROM
   three_values AS x(a)
   CROSS JOIN
   three_values AS y(b);

This is the truth table:

a    | b    | a_equals_null | a_is_null | a_or_b | a_and_b | not_a | a_xor_b | a_implies_b
:--- | :--- | :------------ | :-------- | :----- | :------ | :---- | :------ | :----------
null | null | null          | t         | null   | null    | null  | null    | null       
null | f    | null          | t         | null   | f       | null  | null    | null       
null | t    | null          | t         | t      | null    | null  | null    | t          
f    | null | null          | f         | null   | f       | t     | null    | t          
f    | f    | null          | f         | f      | f       | t     | f       | t          
f    | t    | null          | f         | t      | f       | t     | t       | t          
t    | null | null          | f         | t      | null    | f     | null    | null       
t    | f    | null          | f         | t      | f       | f     | t       | f          
t    | t    | null          | f         | t      | t       | f     | f       | t          

dbfiddle here

Upvotes: 2

Related Questions