Adham
Adham

Reputation: 342

Left join on substring compared with number in Oracle

I am using Oracle 11g Enterprise Edition Release 11.2.0.4.0 and I want to left join two tables on a substring of one of the columns. (see http://sqlfiddle.com/#!9/8955c8/8 for example)

Table 1 is DW_PCS:

DESC DW_PCS

Result:
PAYEE_ID      VARCHAR2
ACCOUNT_NB    VARCHAR2

Table 2 is DW_CLAIM:

DESC DW_CLAIM

Result:
CLAIM_ID      VARCHAR2
CLAIMANT_ID   NUMBER

SUBSTR(DW_PCS.PAYEE_ID, 2, 7) is an alias for CLAIMANT_ID, so I tried the following which works fine in MySQL (see http://sqlfiddle.com/#!9/8955c8/8):

SELECT 
  DW_PCS.PAYEE_ID, 
  SUBSTR(DW_PCS.PAYEE_ID, 2, 7) AS CLAIMANT_ID, 
  DW_PCS.ACCOUNT_NB,
  DW_CLAIM.CLAIM_ID 
FROM 
  DW_PCS 
  LEFT JOIN DWH.DW_CLAIM ON 
  SUBSTR(DW_PCS.PAYEE_ID, 2, 7) = DW_CLAIM.CLAIMANT_ID

But this throws ORA-12801/ORA-01722 error which apparently means that I am not using the correct data types in my comparison.

Now, if I change the condition to SUBSTR(DW_PCS.PAYEE_ID, 2, 7) = TO_CHAR(DW_CLAIM.CLAIMANT_ID), it works unless the PAYEE_ID is something like C0152426. The leading zeros become problematic because of DW_CLAIM.CLAIMANT_ID is a number. How should I make this query in Oracle?

Upvotes: 0

Views: 1100

Answers (3)

MT0
MT0

Reputation: 168137

this throws []ORA-01722 error which apparently means that I am not using the correct data types in my comparison.

No, ORA-01722 is an Invalid Number error. This means that Oracle is implicitly trying to convert the values to the same data type and is converting SUBSTR(DW_PCS.PAYEE_ID, 2, 7) to a number but at least one of the rows has non-numeric values in that substring and the conversion fails.

If this is not to be expected then you need to fix your data - you can find the offending rows using:

SELECT *
FROM   DW_PCS
WHERE  NOT REGEXP_LIKE( SUBSTR(PAYEE_ID, 2, 7), '^\d+$' )

The leading zeros become problematic because of DW_CLAIM.CLAIMANT_ID is a number. How should I make this query in Oracle?

You can solve your problem by zero-padding the start of the CLAIMANT_ID when you convert it to a string:

SELECT 
  DW_PCS.PAYEE_ID, 
  SUBSTR(DW_PCS.PAYEE_ID, 2, 7) AS CLAIMANT_ID, 
  DW_PCS.ACCOUNT_NB,
  DW_CLAIM.CLAIM_ID 
FROM 
  DW_PCS 
  LEFT JOIN DWH.DW_CLAIM ON 
  SUBSTR(DW_PCS.PAYEE_ID, 2, 7) = TO_CHAR( DW_CLAIM.CLAIMANT_ID, 'fm0000000' )

Upvotes: 0

Radagast81
Radagast81

Reputation: 3016

I think that you have some "corrupt" data, where SUBSTR(d.PAYEE_ID, 2, 7) isn't numeric. You can identify these using REGEXP_LIKE for example: WHERE REGEXP_LIKE(SUBSTR(d.PAYEE_ID, 2, 7), '\D').

Then it is your choice how to proceed with this data. One possibility is to use case in the JOIN condition and only JOIN when SUBSTR(d.PAYEE_ID, 2, 7) is numeric:

SELECT
    d.PAYEE_ID,
    SUBSTR(d.PAYEE_ID, 2, 7) AS CLAIMANT_ID,
    d.ACCOUNT_NB,
    c.CLAIM_ID
FROM DW_PCS d
LEFT JOIN DW_CLAIM c
  ON c.CLAIMANT_ID = CASE WHEN NOT REGEXP_LIKE(SUBSTR(d.PAYEE_ID, 2, 7),'\D') 
                          THEN TO_NUMBER(SUBSTR(d.PAYEE_ID, 2, 7))
                     END

P.S.: I think MySQL transforms both sides to strings for the comparison whereas Oracle tries to transform the left side to a number which results in this error. Therefore the difference...

Upvotes: 1

Tim Biegeleisen
Tim Biegeleisen

Reputation: 521794

Try casting the PAYEE_ID substring to a number:

SELECT
    d.PAYEE_ID,
    SUBSTR(d.PAYEE_ID, 2) AS CLAIMANT_ID,
    d.ACCOUNT_NB,
    dc.CLAIM_ID
FROM DW_PCS d
LEFT JOIN DWH.DW_CLAIM dc
    ON TO_NUMBER(SUBSTR(d.PAYEE_ID, 2)) = c.CLAIMANT_ID;

Using this approach, both sides of the ON comparison would be numbers, thereby avoiding the leading zero problem.

Upvotes: 0

Related Questions