Alan Wayne
Alan Wayne

Reputation: 5384

Can JOIN operations be ordered in a CTE? (PostgreSQL)

PostgreSQL 9.5

The below CTE works correctly to move records from tables CPT and CPT_INVOICE to DOCTOR_PROCEDURES and update DOCTORBILLING uid accordingly. However, CPT_INVOICE has a foreign key to its parent, CPT, so this script fails until that foreign key relationship is removed.

Is there any way of forcing PostgreSQL to execute CTE's in a specific order, i.e., to first execute planC before planB?

TIA

WITH planA AS (
    select cpt_recid from doctorbilling
),
planC as (
    delete from cpt_invoice D
    USING planA a
    where D.recid = A.cpt_recid
    returning D.cpt_recid, D.ninsurance, D.ncash, D.mustschedule, D.doneinoffice
),
planB as (
    delete from cpt C
    USING planA A
    where C.recid = A.cpt_recid
    returning C.recid as cpt_recid, C.code, C.cdesc, C.procedure_type, C.sex
),
planD as (
    insert into doctor_procedures (code, cdesc, procedure_type, sex, ninsurance, ncash, mustschedule, doneinoffice, cpt_recid)
    select distinct on (b.cdesc)  b.code, b.cdesc, b.procedure_type, b.sex, c.ninsurance, c.ncash, c.mustschedule, c.doneinoffice, a.cpt_recid
    from planA A
    join planB B on B.cpt_recid = A.cpt_recid
    left join planC C on C.cpt_recid = A.cpt_recid   -- there may not be a cpt_invoice for the cpt_recid.
    order by b.cdesc
    returning cpt_recid, uid
)
update doctorbilling T
set uid = D.uid
from planD D
where T.cpt_recid = D.cpt_recid

Upvotes: 3

Views: 393

Answers (2)

Patrick
Patrick

Reputation: 32199

The easiest solution would be to make plan B dependent on plan C (with plan A removed):

WITH planC as (
    delete from cpt_invoice D
    USING doctorbilling A
    where D.recid = A.cpt_recid
    returning D.recid, D.cpt_recid, D.ninsurance, D.ncash, D.mustschedule, D.doneinoffice
),
planB as (
    delete from cpt C
    USING planC X
    where C.recid = X.recid
    returning C.recid as cpt_recid, C.code, C.cdesc, C.procedure_type, C.sex
),
planD as (
    insert into doctor_procedures (code, cdesc, procedure_type, sex, ninsurance, ncash, mustschedule, doneinoffice, cpt_recid)
    select distinct on (b.cdesc)  b.code, b.cdesc, b.procedure_type, b.sex, c.ninsurance, c.ncash, c.mustschedule, c.doneinoffice, b.cpt_recid
    from planB B
    left join planC C on C.cpt_recid = B.cpt_recid   -- there may not be a cpt_invoice for the cpt_recid.
    order by b.cdesc
    returning cpt_recid, uid
)
update doctorbilling T
set uid = D.uid
from planD D
where T.cpt_recid = D.cpt_recid;

This all looks a tad odd, though, because you are doing largely unqualified data modification statements on all rows from table doctorbilling. In practice, you are more likely to move a single cpt_recid at a time, which would make the query quite a bit more straightforward:

WITH planC as (
    delete from cpt_invoice D
    where D.recid = <<cpt_recid>>
    returning D.recid, D.cpt_recid, D.ninsurance, D.ncash, D.mustschedule, D.doneinoffice
),
planB as (
    delete from cpt C
    USING planC X
    where C.recid = X.recid -- maintain dependency
    returning C.code, C.cdesc, C.procedure_type, C.sex
),
planD as (
    insert into doctor_procedures (code, cdesc, procedure_type, sex, ninsurance, ncash, mustschedule, doneinoffice, cpt_recid)
    select distinct on (b.cdesc)  b.code, b.cdesc, b.procedure_type, b.sex, c.ninsurance, c.ncash, c.mustschedule, c.doneinoffice, b.cpt_recid
    from planB B
    left join planC C on true   -- there may not be a cpt_invoice for the cpt_recid.
    order by b.cdesc
    returning uid
)
update doctorbilling T
set uid = D.uid
from planD D
where T.cpt_recid = <<cpt_recid>>;

Even better would be a PL/pgSQL function:

CREATE FUNCTION move_recid (id integer) RETURNS void AS $$
DECLARE
    ... -- declare all variables
BEGIN
    delete from cpt_invoice
    where recid = id
    returning cpt_recid, ninsurance, ncash, mustschedule, doneinoffice
         into inv_recid, inv_ins, inv_cash, inv_sch, inv_doi;

    delete from cpt
    where recid = id
    returning code, cdesc, procedure_type, sex
         into cpt_code, cpt_desc, cpt_proc, cpt_sex;

    insert into doctor_procedures (code, cdesc, procedure_type, sex, ninsurance, ncash,
                                   mustschedule, doneinoffice, cpt_recid)
    values (cpt_code, cpt_desc, ...)
    returning uid into dp_uid;

    update doctorbilling
    set uid = dp_uid
    where cpt_recid = id;
END;
$$ LANGUAGE plpgsql STRICT;

Order guaranteed. Easy to understand by fellow programmers, easy to maintain.

Upvotes: 1

Chris Travers
Chris Travers

Reputation: 26464

First, generally if you heave heavy order dependencies you are better off solving this in other ways. SQL is a declarative language and has no concept of ordering so anything we do here is dependent on implementation details rather than standard expected behavior. Your best bet would be to wrap in a user defined function with the logic more clearly broken up. As an alternative you could mark the foreign key constraint DEFERRABLE and set it DEFERRED just before running this query (then set it IMMEDIATE after the query). Those would be the best options and the correct way to solve your problem.

On to your specific desired solution. Your problem here comes not from a need to order CTEs generally but more generally from a need to order join operations. I think the following might be safe in this specific circumstance but I am not entirely sure (i.e a smarter planner might break it in the future).

planD as (
    insert into doctor_procedures (code, cdesc, procedure_type, sex, ninsurance, ncash, mustschedule, doneinoffice, cpt_recid)
    select distinct on (b.cdesc)  b.code, b.cdesc, b.procedure_type, b.sex, c.ninsurance, c.ncash, c.mustschedule, c.doneinoffice, a.cpt_recid
    from planA A
    left join planC C on C.cpt_recid = A.cpt_recid   -- there may not be a cpt_invoice for the cpt_recid.
    join planB B on B.cpt_recid = A.cpt_recid OR B.cpt_recid = c.cpt_recid
    order by b.cdesc
    returning cpt_recid, uid
)

The reason I am not entirely confident in this being a long term solution is that a smarter hypothetical planer might be able to note that c.cpt_recid will always equal a.cpt_recid and therefore the OR clause is always redundant.

Upvotes: 1

Related Questions