H Jansen
H Jansen

Reputation: 319

Budget and actual rows not merged on full outer join

The following query:

select coalesce(to_number(bl.division_code), bud.division) division
,      coalesce(bud.glaccountcode, bl.costcenter_costanalysis_period_periods_year_years_balance_code_attr) glaccountcode
,      coalesce(bud.costcenter, bl.costcenter_code_attr) costcenter
,      coalesce(bud.costunit, bl.code_attr) costunit
,      coalesce(bud.reportingyear, bl.costcenter_costanalysis_period_periods_year_reportingyear_attr) reportingyear
,      coalesce(bud.reportingperiod, bl.costcenter_costanalysis_period_reportingperiod_attr) reportingperiod
,      case when bud.amountdc > 0 then 456 else null end budgetamountdc label 'Budget (anonymized, EUR)'
,      case when bl.balance > 0 then 123 else null end actualsamountdc label 'Actuals (anonymized, EUR)'
,      case
       when bl.division_code is null
       then 'budget'
       when bud.division is null
       then 'balancelines'
       else 'both'
       end
       label 'Source'
from   exactonlinexml..balancelinesperperiodcostanalysis bl
full
outer
join   exactonlinerest..budgets bud
on     bud.division        = to_number(bl.division_code)
and    bud.glaccountcode   = bl.costcenter_costanalysis_period_periods_year_years_balance_code_attr
and    bud.costcenter      = bl.costcenter_code_attr
and    bud.costunit        = bl.code_attr
and    bud.reportingyear   = bl.costcenter_costanalysis_period_periods_year_reportingyear_attr
and    bud.reportingperiod = bl.costcenter_costanalysis_period_reportingperiod_attr

joins actual transactions on General Ledger accounts with associated budgets on a detailed level:

I expect at most one row of data per combination of those dimensions. However, for some combination, 2 rows are returned. One of these rows has a label 'budget' whereas the other one has 'balancelines'.

It seems that somehow they are not merged together in the coalesce:

Invantive Query Tool output

The contents of gl account 5050 in balance lines of period 1 in 2019 is one row with a certain amount (not equal 0).

The contents of GL account 5050 in budgets of period 1 in 2019 is also one row with a certain amount (not equal 0).

I seem unable to find why the rows are not merged together by the full outer join and coalesce.

What am I doing wrong?

Upvotes: 1

Views: 47

Answers (1)

Guido Leenders
Guido Leenders

Reputation: 4262

You are using a join on 6 dimensions. Each of these dimensions is part of the primary key of balances and also of budgets. However, some of these dimensions can contain null values. Null is something special with SQL logic and defines an unknown value of unknown type (there can be various types of unknown values). SQL uses a three-valued logic (Mr. Brouwer says hello). See the output of the query

select 1=1 are_they_equal_bool1
,      1=0 are_they_equal_bool2
,      null=null are_they_equal_bool3

it clearly shows that null=null evaluates to something "gray", meaning null: unknown whether it is true or false:

null=null has unknown (null) outcome

You need to compensate for null values in your join. In this case you will probably define NULL for cost center and cost unit as a financial amount not related to a cost center and/or cost unit. In that case, the null in balance lines and the null in budgets have the same semantics.

The clean route is to adapt the original join conditions:

and    bud.costcenter      = bl.costcenter_code_attr
and    bud.costunit        = bl.code_attr

to have null in both columns to have an identical meaning:

and    ( ( bud.costcenter is null and bl.costcenter_code_attr is null )
         or 
         bud.costcenter      = bl.costcenter_code_attr 
       )
and    ( ( bud.costunit is null and bl.code_attr is null )
         or 
         bud.costunit      = bl.code_attr 
       )

Upvotes: 1

Related Questions