DaveB
DaveB

Reputation: 3083

postgresql where condition returns at least one result

postgreSQL question...I have an update query below which updates a column with the results from a subquery, however in some cases the subquery will return null which throws the 'not null' constraint on the column, how can I get it to NOT update if the subquery returns null?

I have tried EXISTS but this only seems to work on a WHERE clause?

UPDATE user_stats as stats
SET ave_price = (
    SELECT AVG(l.price)
    FROM lengths as l, user_sessions as us
    WHERE l.product_type = 'car'
    AND l.session_id = us.session_id
    AND stats.user_id = us.user_id
)

Upvotes: 2

Views: 3257

Answers (2)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 656471

Use a an actual subquery to select from instead of a subquery expression:

UPDATE user_stats s
SET    ave_price = x.ave_price
FROM  (
    SELECT user_id
          ,avg(l.price) AS ave_price
    FROM   lengths l
    JOIN   user_sessions us ON us.session_id = l.session_id
    WHERE  l.product_type = 'car'
    GROUP  BY us.user_id
    HAVING avg(l.price) IS NOT NULL
    ) x
WHERE x.user_id = s.user_id;

This will be faster, too. If you have a relevant proportion of user_id that exists in the table user_sessions, but not in user_stats, then the following query might be faster (while both yield the same result in every case):

UPDATE user_stats s
SET    ave_price = x.ave_price
FROM  (
    SELECT user_id
          ,avg(l.price) AS ave_price
    FROM   lengths l
    JOIN   user_stats usr USING (user_id)
    JOIN   user_sessions us ON us.session_id = l.session_id
    WHERE  l.product_type = 'car'
    GROUP  BY us.user_id
    HAVING avg(l.price) IS NOT NULL
    ) x
WHERE x.user_id = s.user_id;

The point of the second version is to exclude irrelevant rows early. The same query written with a CTE (somewhat more elegant and readable):

WITH x AS (
    SELECT user_id
          ,avg(l.price) AS ave_price
    FROM   lengths l
    JOIN   user_stats usr USING (user_id)
    JOIN   user_sessions us ON us.session_id = l.session_id
    WHERE  l.product_type = 'car'
    GROUP  BY us.user_id
    HAVING avg(l.price) IS NOT NULL
    )
UPDATE user_stats s
SET    ave_price = x.ave_price
FROM   x
WHERE  x.user_id = s.user_id;

Be advised that while CTE for SELECT queries were introduced with PostgreSQL 8.4, CTE for data modification commands were only introduced with PostgreSQL 9.1:

Allow data-modification commands (INSERT/UPDATE/DELETE) in WITH clauses

Upvotes: 1

xQbert
xQbert

Reputation: 35323

coalesce, nvl, ifnull in most db engines will do a conditional statement that says take the first non-null value in the string in this case when the subselect returns null it will set the ave_price = to itself.

UPDATE user_stats as stats
SET ave_price = coalesce((
    SELECT AVG(l.price)
    FROM lengths as l, user_sessions as us
    WHERE l.product_type = 'car'
    AND l.session_id = us.session_id
    AND stats.user_id = us.user_id
),ave_price)

This doesn't prevent the udpate as requested, but it has a similar effect on the data.

For more info on coalesce see: PostgreSQL

To actually prevent the update you would needto add a where clause on the update and re-execute the sub query such as:

    UPDATE user_stats as stats
    SET ave_price = (
        SELECT AVG(l.price)
        FROM lengths as l, user_sessions as us
        WHERE l.product_type = 'car'
        AND l.session_id = us.session_id
        AND stats.user_id = us.user_id)
WHERE (SELECT AVG(l.price)
        FROM lengths as l, user_sessions as us
        WHERE l.product_type = 'car'
        AND l.session_id = us.session_id
        AND stats.user_id = us.user_id) is not null

Logically executing the subquery twice would impact performance twice; whereas the coalesce only requires execution once. There's always multiple ways to do things and depending on requirements, one must choose which option serves them best.

Upvotes: 4

Related Questions