James

Reputation: 572

Loading large amounts of data to an Oracle SQL Database

I was wondering if anyone had any experience with what I am about to embark on. I have several CSV files, each around a GB in size, that I need to load into an Oracle database. While most of my work after loading will be read-only, I will need to load updates from time to time. Basically, I just need a good tool for loading several rows of data at a time into my db.

Here is what I have found so far:

  1. I could use SQL*Loader to do a lot of the work

  2. I could use Bulk-Insert commands

  3. Some sort of batch insert.

Using prepared statements somehow might be a good idea. I guess I was wondering what everyone thinks is the fastest way to get this insert done. Any tips?

Upvotes: 3

Views: 9314

Answers (3)

Stellios

Reputation: 742

You may be able to create an external table on the CSV files and load them in by SELECTing from the external table into another table. I'm not sure whether this method will be quicker overall, but it may save you the hassle of getting SQL*Loader working, especially when you have criteria for UPDATEs.
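
For illustration, a minimal sketch of that approach; the directory path, file name, and column layout are all assumptions about your setup:

    -- One-time setup: a directory object pointing at the CSV files
    CREATE DIRECTORY csv_dir AS '/data/csv';

    -- External table: Oracle reads the CSV in place, no separate load step
    CREATE TABLE staging_ext (
      id     NUMBER,
      name   VARCHAR2(100),
      amount NUMBER
    )
    ORGANIZATION EXTERNAL (
      TYPE ORACLE_LOADER
      DEFAULT DIRECTORY csv_dir
      ACCESS PARAMETERS (
        RECORDS DELIMITED BY NEWLINE
        FIELDS TERMINATED BY ','
        MISSING FIELD VALUES ARE NULL
      )
      LOCATION ('data1.csv')
    )
    REJECT LIMIT UNLIMITED;

    -- Initial load: direct-path insert into the real table
    INSERT /*+ APPEND */ INTO target_table
    SELECT * FROM staging_ext;
    COMMIT;

    -- Later updates: MERGE covers the UPDATE-or-INSERT criteria
    MERGE INTO target_table t
    USING staging_ext s
    ON (t.id = s.id)
    WHEN MATCHED THEN UPDATE SET t.name = s.name, t.amount = s.amount
    WHEN NOT MATCHED THEN INSERT (id, name, amount)
      VALUES (s.id, s.name, s.amount);

Point the LOCATION clause at each file (you can also list several files at once) as you work through them.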

Upvotes: 0

DCookie

Reputation: 43523

I would be very surprised if you could roll your own utility that will outperform SQL*Loader Direct Path Loads. Oracle built this utility for exactly this purpose - the likelihood of building something more efficient is practically nil. There is also the Parallel Direct Path Load, which allows you to have multiple direct path load processes running concurrently.

From the manual:

Instead of filling a bind array buffer and passing it to the Oracle database with a SQL INSERT statement, a direct path load uses the direct path API to pass the data to be loaded to the load engine in the server. The load engine builds a column array structure from the data passed to it.

The direct path load engine uses the column array structure to format Oracle data blocks and build index keys. The newly formatted database blocks are written directly to the database (multiple blocks per I/O request using asynchronous writes if the host platform supports asynchronous I/O).

Internally, multiple buffers are used for the formatted blocks. While one buffer is being filled, one or more buffers are being written if asynchronous I/O is available on the host platform. Overlapping computation with I/O increases load performance.

There are cases where Direct Path Load cannot be used.
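
For reference, a minimal sketch of a direct path load; the table, file, and column names here are placeholders, not taken from the question:

    -- load.ctl: a minimal SQL*Loader control file
    LOAD DATA
    INFILE 'data1.csv'
    APPEND
    INTO TABLE target_table
    FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
    (id, name, amount)

    -- Invoke with (credentials are a placeholder):
    --   sqlldr userid=scott/tiger control=load.ctl log=load.log direct=true

DIRECT=TRUE on the command line is what selects the direct path API described above; without it, SQL*Loader falls back to conventional bind-array inserts.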

Upvotes: 5

jim mcnamara

Reputation: 16379

With that amount of data, you'd better be sure of your backing store - the free space on the disks holding your dbf files.

sqlldr is script driven and very efficient, generally more efficient than a SQL script. The only thing I wonder about is the magnitude of the data. I personally would consider running several to many sqlldr processes, assigning each one a subset of the data, and letting them run in parallel, as sketched below.
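
A rough sketch of that scheme, assuming the data has been split into one file per process (file names and credentials are placeholders):

    -- load1.ctl: one control file per process, each with its own slice
    OPTIONS (DIRECT=TRUE, PARALLEL=TRUE)
    LOAD DATA
    INFILE 'part1.csv'
    APPEND
    INTO TABLE target_table
    FIELDS TERMINATED BY ','
    (id, name, amount)

    -- Start one sqlldr per control file and let them run concurrently:
    --   sqlldr userid=scott/tiger control=load1.ctl &
    --   sqlldr userid=scott/tiger control=load2.ctl &
    -- PARALLEL=TRUE lets multiple direct path sessions load the same table;
    -- note that parallel direct path loads only allow APPEND mode.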

You said you wanted to load a few records at a time? That may take a lot longer than you think. Did you mean a few files at a time?

Upvotes: 0
