Reputation: 2454
My question applies to ETL scenarios where the transformation is performed entirely outside the database. If you were to Extract, Transform, and Load huge volumes of data (20+ million records or more), and the databases involved are Oracle and MSSQL Server, what would be the best way to do it?
Appreciate your thoughts/suggestions.
Upvotes: 1
Views: 3048
Reputation: 32392
I would always use the db's bulk load facilities for this. Remote control of bulk loads is a sysadmin issue; there is always a way to do this.
This means that the first two stages of ETL would be an application that generates the right file format for the bulk load facility, and the last stage would be to invoke the bulk load.
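For instance, on the SQL Server side the flat file written by the transform step could be handed straight to the bulk load facility. This is only a sketch: the table name, file path, delimiter, and batch size below are assumptions, not anything from the question.

    -- Load a pipe-delimited flat file produced by the transform stage
    -- into a SQL Server staging table via the bulk load facility.
    BULK INSERT dbo.stg_orders            -- staging table (assumed name)
    FROM 'D:\etl\out\orders.dat'          -- file written by the transform step (assumed path)
    WITH (
        FIELDTERMINATOR = '|',
        ROWTERMINATOR   = '\n',
        TABLOCK,                          -- table lock enables minimal logging under the right recovery model
        BATCHSIZE       = 500000          -- commit in batches of 500k rows
    );

On the Oracle side the equivalent would be SQL*Loader (direct path) or an external table, as described in the other answer.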
Upvotes: 1
Reputation: 41
If you extract the data to a flat file, you can use Oracle External Tables to read the file into Oracle directly. At that point you can do a direct-path insert (/*+ append */) that performs the necessary transformation. Also, if the table is NOLOGGING then you will save on redo costs, but you need to take into account that the data could be lost and would need to be reloaded in the case of a media failure before the next backup.
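A minimal sketch of that approach, assuming a pipe-delimited file, a directory object named etl_dir, and hypothetical table and column names:

    -- Expose the extracted flat file as an external table.
    CREATE TABLE orders_ext (
        order_id    NUMBER,
        order_date  VARCHAR2(10),
        amount      NUMBER
    )
    ORGANIZATION EXTERNAL (
        TYPE ORACLE_LOADER
        DEFAULT DIRECTORY etl_dir            -- directory object pointing at the file location
        ACCESS PARAMETERS (
            RECORDS DELIMITED BY NEWLINE
            FIELDS TERMINATED BY '|'
            MISSING FIELD VALUES ARE NULL
        )
        LOCATION ('orders.dat')
    )
    REJECT LIMIT UNLIMITED;

    -- Direct-path insert into the (possibly NOLOGGING) target table.
    INSERT /*+ APPEND */ INTO orders_target
    SELECT order_id, TO_DATE(order_date, 'YYYY-MM-DD'), amount
    FROM   orders_ext;
    COMMIT;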
Also, if the data is time-based, consider using partitioning and partition exchanges. You can load the data via the method above into an index-less staging table, create the necessary indexes (again, possibly NOLOGGING), which is faster than maintaining the indexes during the insert, then 'exchange' the partition into the master table.
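A rough sketch of that staging-and-exchange pattern; the partition, table, and index names are assumed, and for INCLUDING INDEXES to work the staging table's indexes must match the master table's local indexes:

    -- 1. Index-less, NOLOGGING staging table with the same structure as the master.
    CREATE TABLE orders_stage NOLOGGING
    AS SELECT * FROM orders_master WHERE 1 = 0;

    -- ... bulk load orders_stage here (e.g. with the direct-path insert above) ...

    -- 2. Build the indexes after the load; NOLOGGING avoids most redo.
    CREATE INDEX orders_stage_ix ON orders_stage (order_date) NOLOGGING;

    -- 3. Swap the loaded segment into the partitioned master table.
    ALTER TABLE orders_master
        EXCHANGE PARTITION p_2023_10 WITH TABLE orders_stage
        INCLUDING INDEXES WITHOUT VALIDATION;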
Upvotes: 0