Reputation: 5384
A newbie question for PostgreSQL. I have the following query in which I am trying to return a union of all records that fit the criteria of:
Is there a better way of doing this (perhaps using the WITH clause of PostgreSQL)? I have read not to us "NOT IN" in PostgreSQL, so how better could this be done? How do you "optimize" this thing?
CREATE OR REPLACE FUNCTION f_getactivedx(groupid character varying, tencounter timestamp without time zone)
RETURNS SETOF view_dx AS
$BODY$
select max(dx.recid) as recid, dx.cicd9, dx.cdesc, max( dx.tposted) as tposted,
bool_or(dx.resolved) as resolved, bool_or(dx.treated) as treated, bool_or(dx.chronic),
dx.groupid
from dx
where dx.chronic = true
and dx.groupid = $1
and date_trunc('day',dx.tposted) <= date_trunc('day',$2)
group by dx.cicd9, dx.cdesc, dx.groupid
union
select max(dx.recid) as recid, dx.cicd9, dx.cdesc, max( dx.tposted) as tposted,
bool_and(dx.resolved), bool_and(dx.treated), bool_and(dx.chronic), dx.groupid
from dx
join (select cdesc, max(tposted) as tposted from dx
where groupid =$1 and resolved = true and
date_trunc('day',tposted) <= date_trunc('day', $2)
group by cdesc) j
on (dx.cdesc = j.cdesc and dx.tposted > j.tposted)
where groupid = $1 and treated = true
and date_trunc('day',dx.tposted) <= date_trunc('day', $2)
group by dx.cicd9, dx.cdesc, dx.groupid
union
select max(dx.recid) as recid, dx.cicd9, dx.cdesc, max( dx.tposted),
bool_and(dx.resolved), bool_and(dx.treated), bool_and(dx.chronic), dx.groupid
from dx
where dx.cdesc NOT IN
(select cdesc from dx
where groupid =$1 and resolved = true and
date_trunc('day',tposted) <= date_trunc('day', $2)
group by cdesc)
and groupid =$1 and treated = true and
date_trunc('day',tposted) <= date_trunc('day', $2)
group by dx.cicd9, dx.cdesc, dx.groupid
order by tposted desc, treated desc, resolved desc, cdesc asc
Upvotes: 0
Views: 101
Reputation: 1904
Using NOT EXISTS
is often more efficient than using NOT IN
.
SELECT A.*
FROM A
WHERE NOT EXISTS (
SELECT 1
FROM B
WHERE A.id = B.id
)
Upvotes: 1
Reputation: 324475
NOT IN
can be OK, you just have to think carefully about NULL, and an anti-join is often a better choice.
For any query:
SELECT ...
FROM t
WHERE col NOT IN (SELECT col2 FROM t2 WHERE col2 IS NOT NULL AND ...predicate...)
you can equivalently write:
SELECT ...
FROM t LEFT OUTER JOIN t2 ON (t.col = t2.col2 AND ...predicate...)
WHERE t2.col2 IS NULL;
which is called a "left anti-join".
PostgreSQL may well produce the same query plan for both.
Unless you know that your NOT IN
use is causing performance issues for some reason, after proper checking with explain analyze
, I strongly suggest just leaving it alone.
Upvotes: 1