Lucy82
Lucy82

Reputation: 693

Extract CLOB data for insert

I have CLOB data like this:

123456 (LED TV); 234543 (LED light); 654876 (LED monitor);

Now I need to extract all 6-digit numbers (invoice tracking number in my case) from CLOB using delimiter ( in my case ";" ) and select that for Insert into target table but only If record doesn't exist.

I've seen couple of examples with Instr & Substr or Regexp, but nothing what I need or beyond my understanding of Oracle. Can somebody show me an example of how to split CLOB into rows based on string inside CLOB, to make use of It for later Insert ?

P.S.: I would prefer fastest solution since my CLOB data could contain over 5 million of invoice records. It's going to be a stored procedure fired from C# at the end, but this part is causing me headaches... For any help - thanks in advance !

Upvotes: 0

Views: 1270

Answers (2)

Popeye
Popeye

Reputation: 35900

I have tried to complete the task using DBMS_LOB package to convert them to strings divided by ";" and then did some string operations on it to achieve the result.

Try following:

INSERT INTO INVOICE_CATEGORIZED 
SELECT TAB.INVOICE_NUMBER, TAB.INVOICE_NAME FROM
(SELECT 
TRIM(dbms_lob.SUBSTR(INVOICE_INN,6 ,1)) AS INVOICE_NUMBER, 

SUBSTR(INVOICE_INN, 
INSTR(INVOICE_INN, '(') + 1,
INSTR(INVOICE_INN, ')') - INSTR(INVOICE_INN, '(') - 1 )
 AS INVOICE_NAME

-- HERE INVOICE_INN IS STRING NOW, SO WE CAN DO STRING OPERATIONS ON IT ONWARD

FROM
(
-- DIVIDING ; SEPARATED CLOB TO INDIVIDUAL STRING
SELECT
    TRIM(CASE WHEN INVOICE_SINGLE.COLUMN_VALUE = 1 THEN
    dbms_lob.SUBSTR(INVOICE, 
    dbms_lob.INSTR(INVOICE,';',1,INVOICE_SINGLE.COLUMN_VALUE) - 1,
    1 
    )
    ELSE
    dbms_lob.SUBSTR(INVOICE, 
    dbms_lob.INSTR(INVOICE,';',1,INVOICE_SINGLE.COLUMN_VALUE) - 1
    - dbms_lob.INSTR(INVOICE,';',1,INVOICE_SINGLE.COLUMN_VALUE - 1),
    dbms_lob.INSTR(INVOICE,';',1,INVOICE_SINGLE.COLUMN_VALUE - 1) + 1)
    END) AS INVOICE_INN
FROM
    INVOICES T,
    TABLE ( CAST(MULTISET(
        SELECT
            LEVEL
        FROM
            DUAL
        CONNECT BY
            dbms_lob.INSTR(INVOICE,';',1,LEVEL) <> 0
    ) AS SYS.ODCINUMBERLIST) ) INVOICE_SINGLE)) TAB 
     WHERE NOT EXISTS (SELECT 1 FROM INVOICE_CATEGORIZED IC
    WHERE IC.INVOICE_NUMBER  = TAB.INVOICE_NUMBER
    AND IC.INVOICE_NAME = TAB.INVOICE_NAME)

db<>fiddle demo

Cheers!!

Upvotes: 1

Littlefoot
Littlefoot

Reputation: 142720

Here's an example.

Test case first; test table contains source data:

SQL> create table test (col clob);

Table created.

SQL> insert into test
  2    select '123456 (LED TV); 234543 (LED light); 654876 (LED monitor);' from dual union all
  3    select '665988 (Notebook); 987654 (Mouse); 445577 (Dead Pixel);'    from dual;

2 rows created.

SQL>

Target table will contain values extracted from the source:

SQL> create table target (itn number, name varchar2(20));

Table created.

SQL> -- This value shouldn't be inserted as it already exists in the TARGET table:
SQL> insert into target values (234543, 'LED light');

1 row created.

SQL>

Now, something useful. The idea is to split column values to rows (that's what regexp_substr part in hierarchical query does, and then separate the ID value from the name (which is enclosed into brackets). Values that exist in the target table shouldn't be inserted (so query is supposed to insert 5 rows):

SQL> insert into target (itn, name)
  2  with
  3  c2r as
  4    -- split column to rows, e.g. "123456 (LED TV)" is an example of such a row
  5    (select to_char(trim(regexp_substr(col, '[^;]+', 1, column_value))) val
  6     from test join table(cast(multiset(select level from dual
  7                                        connect by level <= regexp_count(col, ';')
  8                                       ) as sys.odcinumberlist)) on 1 = 1
  9    ),
 10  sep as
 11    -- separate ITN (invoice tracking nubmer) and NAME
 12    (select substr(val, 1, instr(val, ' ') - 1) itn,
 13            substr(val, instr(val, ' ') + 1) name
 14     from c2r
 15    )
 16  select s.itn, replace(replace(s.name, '(', ''), ')', '')
 17  from sep s
 18  -- don't insert values that already exist in the TARGET table
 19  where not exists (select null from target t
 20                    where t.itn = s.itn
 21                   );

5 rows created.

SQL>

The final result:

SQL> select * From target;

       ITN NAME
---------- --------------------
    234543 LED light
    123456 LED TV
    654876 LED monitor
    665988 Notebook
    987654 Mouse
    445577 Dead Pixel

6 rows selected.

SQL>

Upvotes: 0

Related Questions