Reputation: 519
A client site has supplied the following extract file for us to load into our database.
The problem is, for certain rows (the second row for example) the CREATED_DATE and the LAST_UPDATE_DATE are in "dd Mmm YYYY..." date format when the rest the rows (such as the top) are in the format of "YYYY-MM-DD HH24.MI.SSXFF"
PRIMARY_ID ID VALUE CREATED_DATE LAST_UPDATE_DATE
20166267 20834830491 2012-04-30 08:18:00 2012-04-30 08:18:00
20166536 9112 01 Oct 2010 17:27:04 01 Oct 2010 17:27:04
My questions are: Q1. To avoid having to request an extract, can we manipulate these “dd Mmm YYYY...” formatted dates at import time in SQL Loader using the .ctl script? CUrrently my .ctl is
My .ctl file is scripted to import using:
IDENTIFIER_START_DATE TIMESTAMP "YYYY-MM-DD HH24.MI.SSXFF",
LAST_UPDATE_DATE TIMESTAMP "YYYY-MM-DD HH24.MI.SSXFF"
Q2. Is simply asking them for a re-extract with all date formats as requested the best practice in situations like this?
Upvotes: 0
Views: 4626
Reputation: 23757
Use this .ctl script:
load data
append
into table schema_name.table_name
fields terminated by ';' optionally enclosed by '"'
(
PRIMARY_ID,
ID_VALUE,
CREATED_DATE "to_date(:CREATED_DATE, case when regexp_substr(:CREATED_DATE,'\w+',1,2)=regexp_substr(:CREATED_DATE,'\d+',1,2) then 'YYYY-MM-DD HH24:MI:SS' else 'dd Mon YYYY HH24:MI:SS' end)",
LAST_UPDATE_DATE "to_date(:LAST_UPDATE_DATE, case when regexp_substr(:LAST_UPDATE_DATE,'\w+',1,2)=regexp_substr(:LAST_UPDATE_DATE,'\d+',1,2) then 'YYYY-MM-DD HH24:MI:SS' else 'dd Mon YYYY HH24:MI:SS' end)"
)
Upvotes: 1
Reputation: 231781
Whether to request a re-extract of the data depends on a number of factors.
As to how to load the data, while it is possible to do this in a single step, you generally wouldn't want to. It would generally make more sense to load the data into a staging table (or use an external table) where all the columns are declared as VARCHAR2
and then write some ETL logic that transforms the data into the appropriate data types (and logs errors for the data that cannot be converted). For example, if you loaded the data into a staging table where all the columns were defined as VARCHAR2
, you could use something like the my_to_date function in this thread to try a number of different format masks to find one that works (if there are a lot of possible masks, you may want to iterate through a collection rather than hard-coding the two masks as I did in that example).
One additional point... An Oracle DATE
stores the time to the second which appears to be the precision of the data you're being given. It would, therefore, seem to make more sense to load the data into a DATE
column rather than a TIMESTAMP
column.
Upvotes: 2