Reputation: 129
I'm quite new to coding in general and I'm looking to copy 47 columns with c300,000 rows of data, from an Oracle to an SQL database, on a daily basis. The code will be stored as a Windows Service, running at the same time every day (or more likely night).
The data from the Oracle DB table (let's call this the Oracle_Source
) will be used to both append to a history table (call this SQL_History
) and also to append new/update matching/delete missing rows from a live table (call this SQL_Live
). The two types of databases are housed on different servers, but the two SQL tables are on the same DB.
I have a few questions around the best way to approach this.
Using VB/C#, is it faster to loop through rows (either 1 by 1 or batches of 100/1000/etc.) of Oracle_Source
and insert/update SQL_History
/SQL_Live
OR copy the entire table of Oracle_Source
in one go and insert into the SQL tables? Previously I have used the loop to download data into a .csv
.
Using the more efficient of the above methods, would it be faster to work on both SQL tables simultaneously OR copy the data into the SQL_History
table and then use that to APPEND
/UPDATE
/DELETE
from the SQL_Live
table?
Am I approaching this completely wrong?
Any other advice available is also much appreciated.
Upvotes: 2
Views: 1208
Reputation: 383
The correct question is “What is the fast way to copy the table?” In your specific case , with 2 different server and a “big” table to copy, you are probably limited by network IO.
So, the first point is to update only the rows that must be update (Update/ Insert / Delete), so less byte to move.
To answer to your first point, you have to use transaction to improve the speed on sql server during the writing phase. The dimension of transaction depend on differenct factor (db, machine, ...) but I usually make transaction with 500/1000 simple commands. In my personal experience, if you use INSERT with more rows, you can send 500 rows for INSERT without performance issue.
In my experience, a bulk copy is faster than an efficient INSERT, UPDATE and DELETE because the db does not calculate key and does not check duplicate rows.
Better explained:
This is the faster way to copy a table but if your communication is from different server with low network speed this can't be the best choice. Obviously, what is the best choice depend from your infrastructure and the table dimension
For example:
Upvotes: 2