Akiee


How to skip all records having more than one occurrence using a SQL*Loader control file

I want to keep certain records in a data file from being inserted into a table via SQL*Loader. The table we are inserting into has a primary key over 3 columns, and whenever a duplicate occurs, the load fails with a primary key violation.

My requirement is to skip every record whose key occurs more than once in the data file while loading through the control file. Could anyone please help me out here?

Is there any way to skip those records by adding some condition in the control file itself?


Answers (1)

Littlefoot

To me, it seems that you're looking for a problem where there isn't one. Consider what actually happens (a minimal loading sketch follows the list):

  • there's a table
  • it has a primary key constraint set
  • there's a file with data, some of which are duplicates (in terms of the primary key columns)
  • loading session begins
  • rows that satisfy all conditions are successfully loaded into the table
  • rows that violate anything (primary key constraint included) are not loaded
    • the error is recorded in the .log file
    • those rows are now in the .bad file
  • once the loading is done, the table contains only distinct rows
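
To make that concrete, here is a minimal sketch of such a loading session; the table, column, and file names are assumptions, not taken from the question:

-- load.ctl: a minimal control file (all names are assumptions)
LOAD DATA
INFILE 'input.dat'
APPEND
INTO TABLE target_table
FIELDS TERMINATED BY ','
(pkcol1, pkcol2, pkcol3, col1, col2)

and a corresponding command line (credentials are placeholders):

sqlldr scott/tiger control=load.ctl log=load.log bad=load.bad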

Just in case you didn't know, there's the errors parameter; its value is an integer which specifies how many errors you want to allow (see the example after this list):

  • default is 50
  • if you don't want to allow any errors, set it to 0
  • if you want to allow any number of errors, set it to a very high value
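
For example (the control file name and credentials are placeholders), to make the load terminate as soon as a single row is rejected:

sqlldr scott/tiger control=load.ctl errors=0

or, equivalently, put the setting at the top of the control file:

OPTIONS (ERRORS=0)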

So, what is the problem? I don't see any.


As it seems that, in case of duplicates, you don't want to load any of those rows (because you don't know which one is correct), I don't think SQL*Loader is capable of doing that.

One option is to switch to an external table, which uses the input file as its source, acts like an "ordinary" table, and lets you write SELECT statements against it without first loading the data into the database. That enables you to skip the duplicates, e.g.

insert into target_table (col1, col2, ...)
  select col1, col2, ...
  from external_table a
  where (a.pkcol1, a.pkcol2, a.pkcol3) not in
        (select b.pkcol1, b.pkcol2, b.pkcol3
         from external_table b
         group by b.pkcol1, b.pkcol2, b.pkcol3
         having count(*) > 1
        );
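
For completeness, a minimal external table definition for such a file might look like this; the directory path, file name, and column datatypes are assumptions:

-- directory object pointing to where the data file lives (path is an assumption)
create or replace directory load_dir as '/path/to/data/files';

create table external_table (
  pkcol1 number,
  pkcol2 number,
  pkcol3 number,
  col1   varchar2(50),
  col2   varchar2(50)
)
organization external (
  type oracle_loader
  default directory load_dir
  access parameters (
    records delimited by newline
    fields terminated by ','
  )
  location ('input.dat')
);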

Or, you could disable the primary key constraint, load the data as you do now (duplicates included), and then delete all rows whose key appears more than once, e.g.

delete from target_table a
      where (a.pkcol1, a.pkcol2, a.pkcol3) in
            (select b.pkcol1, b.pkcol2, b.pkcol3
             from target_table b
             group by b.pkcol1, b.pkcol2, b.pkcol3
             having count(*) > 1
            );

Then re-enable the primary key constraint.
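
A minimal sketch of that approach, assuming the constraint is named target_table_pk (the name is an assumption):

-- constraint name is an assumption
alter table target_table disable constraint target_table_pk;

-- run the SQL*Loader session here, then delete the duplicates as shown above

alter table target_table enable constraint target_table_pk;

Note that re-enabling the constraint fails if any duplicate keys are still present, which doubles as a final check.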

