Ron Piggott
Ron Piggott

Reputation: 705

pgSQL FULL OUTER JOIN 'WHERE' Condition

I am trying to create a single query to retrieve the current price and special sale price if a sale is running; When there isn't a sale on I want store_picture_monthly_price_special.price AS special_price to return as null.

Before adding the 2nd WHERE condition the query executes as I expect it to: store_picture_monthly_price_special.price returns null since there is no sale running at present.

 store_picture_monthly_reference | tenure |   name    | regular_price | special_price 
                               3 |     12 | 12 Months |        299.99 | {Null}             
                               2 |      3 | 3 Months  |         79.99 | {Null}            
                               1 |      1 | 1 Month   |         29.99 | {Null}            

pgSQL is treating the 2nd WHERE condition as "all or none". If there is no sale running there are no results.

Is it possible to tweak this query so I get regular pricing each and every time and special sale price either as a dollar value when a special is running or returning null Is what I am trying to do be accomplished require sub-query?

This is the query how I presently have it:

SELECT store_picture_monthly.reference AS store_picture_monthly_reference , store_picture_monthly.tenure , store_picture_monthly.name , store_picture_monthly_price_regular.price AS regular_price , store_picture_monthly_price_special.price AS special_price

FROM ( store_picture_monthly INNER JOIN store_picture_monthly_price_regular ON store_picture_monthly_price_regular.store_picture_monthly_reference = store_picture_monthly.reference )

FULL OUTER JOIN store_picture_monthly_price_special ON store_picture_monthly.reference = store_picture_monthly_price_special.store_picture_monthly_reference

WHERE 
    ( store_picture_monthly_price_regular.effective_date < NOW() ) 
    AND
    ( NOW() BETWEEN store_picture_monthly_price_special.begin_date AND store_picture_monthly_price_special.end_date )

GROUP BY store_picture_monthly.reference , store_picture_monthly_price_regular.price , store_picture_monthly_price_regular.effective_date , store_picture_monthly_price_special.price

ORDER BY store_picture_monthly_price_regular.effective_date DESC

Table "store_picture_monthly"

reference bigint,
name text,
description text,
tenure bigint,
available_date timestamp with time zone,
available_membership_reference bigint

Table store_picture_monthly_price_regular

reference bigint ,
store_picture_monthly_reference bigint,
effective_date timestamp with time zone,
price numeric(10,2),
membership_reference bigint

Table store_picture_monthly_price_special

reference bigint,
store_picture_monthly_reference bigint,
begin_date timestamp with time zone,
end_date timestamp with time zone,
price numeric(10,2),
created_date timestamp with time zone DEFAULT now(),
membership_reference bigint

Upvotes: 0

Views: 1539

Answers (3)

Francky
Francky

Reputation: 41

I usually use CTE to make full join with subsets eg:

with subt1 as (select * from t1 where ...)
select * from subt1 
full join t2 on ....

Performance is not good because the full join doesn't use the t1 table's indexes but it can help.

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1270843

The description of the problem suggests that you want a LEFT JOIN, not a FULL JOIN. FULL JOINs are quite rare, particularly in databases with well defined foreign key relationships.

In your case, the WHERE clause is turning your FULL JOIN into a LEFT JOIN anyway, because the WHERE clause requires valid values from the first table.

SELECT spm.reference AS store_picture_monthly_reference, 
       spm.tenure, spm.name, 
       spmpr.price AS regular_price,
       spmps.price AS special_price
FROM store_picture_monthly spm INNER JOIN
     store_picture_monthly_price_regularspmpr
     ON spmpr.store_picture_monthly_reference = spm.reference LEFT JOIN
     store_picture_monthly_price_special spmps 
     ON spm.reference = spmps.store_picture_monthly_reference AND
        NOW() BETWEEN spmps.begin_date AND spmps.end_date
WHERE spmpr.effective_date < NOW();

Notes:

  • I introduced table aliases so the query is easier to write and to read.
  • The condition on the dates for the sale are now in the ON clause.
  • I removed the GROUP BY. It doesn't seem necessary. If it is, you can use SELECT DISTINCT instead. And, I would investigate data problems if this is needed.
  • I am suspicious about the date comparisons. NOW() has a time component. The naming of the comparison columns suggests that the are just dates with no time.

Upvotes: 1

Caius Jard
Caius Jard

Reputation: 74700

Any time that you put a where predicate on a table that is outer joined it converts the outer join to an inner join because the nulls introduced by the outer join can never be compared to anything to produce a true (so the outer join puts a load of rows-with-nulls in where rows don't match, and then the WHERE takes the entire row out again)

Consider this simpler example:

SELECT * FROM 
a LEFT JOIN b ON a.id = b.id
WHERE b.col = 'value'

Is identical to:

SELECT * FROM 
a INNER JOIN b ON a.id = b.id
WHERE b.col = 'value'

To resolve this, move the predicate out of the where and into the ON

SELECT * FROM 
a LEFT JOIN b ON a.id = b.id AND b.col = 'value'

You can also consider:

SELECT * FROM 
a LEFT JOIN b ON a.id = b.id
WHERE b.col = 'value' OR b.col IS NULL

but this might pick up data you don't want, if b.col naturally contains some nulls; it cannot differentiate between nulls that are natively present in b.col and nulls that are introduced by a fail in the join to match a row from b with a row from a (unless we also look at the nullness of the joined id column)

A
id
1
2
3

B
id, col
1, value
3, null

--wrong, behaves like inner
A left join B ON a.id=b.id WHERE b.col = 'value'
1, 1, value

--maybe wrong, b.id 3 might be unwanted
A left join B ON a.id=b.id WHERE b.col = 'value' or b.col is null
1, 1, value
2, null, null
3, 3, null

--maybe right, simpler to maintain than the above
A left join B ON a.id=b.id AND b.col = 'value' 
1, 1, value
2, null, null
3, null, null

In these last two the difference is b.id is null or not, though the row count is the same. If we were counting b.id our count could end up wrong. It's important to appreciate this nuance of join behavior. You might even want it, if you were looking to exclude row 3 but include row 2, by crafting a where clause of a LEFT JOIN b ON a.id=b.id WHERE b.col = 'value' OR b.id IS NULL - this will keep row 2 but exclude row 3 because even though the join succeeds to find a b.id of 3 it is not kept by either predicate

Upvotes: 1

Related Questions