Reputation: 31
We have around 5000 tables in Oracle, and the same 5000 tables exist in SQL Server. Each table's columns vary frequently, but at any point in time the source and destination columns will always be the same. Creating 5000 Data Flow Tasks is a big pain. Further, the mappings have to be redone every time a table definition changes, such as when a column is added or removed.
We tried SSMA (SQL Server Migration Assistant for Oracle), but it was very slow at transferring large amounts of data, so we moved to SSIS.
I have followed the below approach in SSIS:
I created a staging (control) table that holds the table name, the source query (Oracle), and the target query (SQL Server). I queried that table in an Execute SQL Task and stored the result as a Full result set.
I created a Foreach Loop container over that Execute SQL Task's result set, using the Object variable and three variables: table name, source query, and destination query.
In the Data Flow Task I used an OLE DB Source on the Oracle connection and set the data access mode to SQL command from variable (passing the source query from the loop's mapped variable).
For the destination I used an OLE DB Destination on the SQL Server connection and set the data access mode from a variable (passing the target query from the loop's mapped variable).
I am looping this over all 5000 tables, but it is not working. Can you please guide me on how to build this dynamically for 5000 tables from Oracle to SQL Server using SSIS? Any sample code or help would be greatly appreciated. Thanks in advance.
Upvotes: 3
Views: 7481
Reputation: 1
We had similar requirements on a project wherein we wanted to use SSIS for tables that could be created dynamically on a daily basis.
At first we developed package generation code using the programmable SSIS object model, something that was rather difficult given the limited amount of documentation/examples available. I guess this is what BIML does, but in any case we got it working and then management said that the package generation code would be too difficult to maintain (even though it was well structured, concise C# code).
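For a sense of what that kind of code looks like, here is a minimal sketch using the Microsoft.SqlServer.Dts.Runtime object model; the connection string, task, table name, and file path are illustrative placeholders, not the code from our project:

```csharp
using Microsoft.SqlServer.Dts.Runtime;

class PackageGenerator
{
    static void Main()
    {
        var app = new Application();
        var package = new Package { Name = "Load_Stg_Customer" };

        // OLE DB connection manager pointing at the target SQL Server database (placeholder).
        ConnectionManager target = package.Connections.Add("OLEDB");
        target.Name = "TargetDb";
        target.ConnectionString =
            "Provider=SQLNCLI11;Data Source=.;Initial Catalog=Staging;Integrated Security=SSPI;";

        // Execute SQL Task that truncates the staging table before the load.
        Executable exec = package.Executables.Add("STOCK:SQLTask");
        var taskHost = (TaskHost)exec;
        taskHost.Name = "Truncate Stg_Customer";
        taskHost.Properties["Connection"].SetValue(taskHost, target.ID);
        taskHost.Properties["SqlStatementSource"].SetValue(taskHost, "TRUNCATE TABLE dbo.Stg_Customer;");

        // Persist the generated package as a .dtsx file.
        app.SaveToXml(@"C:\GeneratedPackages\Load_Stg_Customer.dtsx", package, null);
    }
}
```

Wiring up the data flow itself (pipeline components and column mappings) through the same object model takes considerably more code than this control-flow fragment.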
After exhaustive reading, we found that the following strategy achieved what the questioner would like to do:
1. Create a standard recipe for loading target tables, in our case staging and reporting tables in a particular data mart:
a. Truncate target staging table;
b. Load target staging table using data from the transactional DB;
c. Merge staging table data into reporting table in data mart.
2. Hand-code a Single SSIS Package to Load any Staging Table Using the Recipe:
a. Define package variables:
i. Truncate Staging Table SQL -- a call to a generated stored procedure;
ii. Load Staging Table SQL -- a call to a generated table function;
iii. Target Staging Table Schema, Name;
iv. Merge Reporting Table SQL -- a call to a generated stored procedure.
b. Implement the Control Flow:
i. Truncate Staging Table SQL;
ii. Data Flow Task (addressed in step 'c', below);
iii. Reporting Table Merge SQL;
c. Code a General Purpose "Script Component" as the Data Flow:
i. Note: this is the workaround for the rigidity of SSIS data flows needing to know metadata at design time. The trick is to write general purpose Script Component code, in C# in our case, to load any staging/reporting table pair according to the standard recipe. Our target reporting data marts were hosted in a SQL Server database, so we could build on this assumption and write SQL Server bulk load logic.
ii. Set Script Component Input Variables:
A. Load Staging Table SQL -- a call to the table function;
B. Target Table Name;
iii. Set Script Component Connections:
A. Source Database Connection;
B. Target Database Connection;
iv. Write General Script Component Code: when you click the "Edit Script ..." button on the Script Component's "Script Transformation Editor's 'Script' Pane", SSIS loads another instance of Visual Studio to edit the Script Component's code. In that new editor, you will find that the names you gave to the Script Component's input variables and connections can be accessed as C# variables, providing a measure of compile time convenience.
We wrote all of our code in the Script Component's PreExecute() method. Our code created SqlConnection objects for the source and target databases, a SqlDataAdapter and DataTable for the input data queried via the table function, and used a SqlBulkCopy object to copy the queried data into the target staging table in the appropriate data mart.
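As a rough illustration rather than our exact code, a pared-down PreExecute() along those lines might look like the following. The variable names (LoadStagingTableSQL, TargetTableName) and connection names (SourceDb, TargetDb) are placeholders, and it assumes both connection managers are ADO.NET connection managers so that AcquireConnection returns a SqlConnection:

```csharp
using System.Data;
using System.Data.SqlClient;

// ScriptMain as generated by the SSIS Script Component; UserComponent comes
// from the component's auto-generated wrapper classes.
public class ScriptMain : UserComponent
{
    public override void PreExecute()
    {
        base.PreExecute();

        // Acquire the source and target connections (placeholder ADO.NET
        // connection managers named SourceDb and TargetDb on the component).
        var sourceConn = (SqlConnection)Connections.SourceDb.AcquireConnection(null);
        var targetConn = (SqlConnection)Connections.TargetDb.AcquireConnection(null);

        // Run the generated table-function query and buffer the rows locally.
        var stagingRows = new DataTable();
        using (var adapter = new SqlDataAdapter(Variables.LoadStagingTableSQL, sourceConn))
        {
            adapter.Fill(stagingRows);
        }

        // Bulk copy the buffered rows into the target staging table.
        using (var bulkCopy = new SqlBulkCopy(targetConn))
        {
            bulkCopy.DestinationTableName = Variables.TargetTableName;
            bulkCopy.WriteToServer(stagingRows);
        }

        Connections.SourceDb.ReleaseConnection(sourceConn);
        Connections.TargetDb.ReleaseConnection(targetConn);
    }
}
```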
When used in this manner, the Script Component effectively allows one to use SSIS to receive its data flow metadata at runtime.
We expanded upon this strategy to create one hand-coded SSIS package per target reporting database technology as we needed to support reporting databases hosted on different RDBMS platforms. We also plan to use this strategy to allow for ETL logic implemented in languages other than C# and VB wherein a Script Component can be configured and written to call into code written in another programming language, e.g., PowerShell or Python (IronPython).
Upvotes: 0
Reputation: 37313
Using SSIS, when thinking about a dynamic source or destination, you have to take into consideration that the only case where you can do that is when the metadata is well defined at run time. In your case:
Each table's columns vary frequently, but at any point in time the source and destination columns will always be the same.
You have to think about building packages programmatically rather than looping over tables.
Yes, you can use loops if you can classify the tables into groups based on their metadata (column names, data types, ...). Then you can create a package for each group.
If you are familiar with C#, you can dynamically import the tables without the need for SSIS. You can refer to the following project to learn more about reading from Oracle and importing into SQL Server using C#:
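Independently of that project, a rough sketch of the C#-only idea, driven by a control table like the one described in the question, could look like this (the connection strings, the dbo.TableMapping table, and its columns are placeholder assumptions):

```csharp
using System.Data;
using System.Data.SqlClient;
using Oracle.ManagedDataAccess.Client;

class OracleToSqlLoader
{
    static void Main()
    {
        // Connection strings are placeholders.
        const string oracleConnStr = "User Id=src_user;Password=***;Data Source=ORCL";
        const string sqlConnStr = "Data Source=.;Initial Catalog=TargetDb;Integrated Security=SSPI;";

        using (var sql = new SqlConnection(sqlConnStr))
        {
            sql.Open();

            // Read the control table that drives the load (table and column names are placeholders).
            var mappings = new DataTable();
            using (var adapter = new SqlDataAdapter("SELECT TableName, SourceQuery FROM dbo.TableMapping", sql))
            {
                adapter.Fill(mappings);
            }

            foreach (DataRow map in mappings.Rows)
            {
                string targetTable = (string)map["TableName"];
                string sourceQuery = (string)map["SourceQuery"];

                using (var oracle = new OracleConnection(oracleConnStr))
                using (var cmd = new OracleCommand(sourceQuery, oracle))
                {
                    oracle.Open();
                    using (OracleDataReader reader = cmd.ExecuteReader())
                    using (var bulkCopy = new SqlBulkCopy(sql))
                    {
                        // Stream rows straight from Oracle into SQL Server; because source and
                        // destination columns match, no explicit column mapping is set up here.
                        bulkCopy.DestinationTableName = targetTable;
                        bulkCopy.BulkCopyTimeout = 0;
                        bulkCopy.WriteToServer(reader);
                    }
                }
            }
        }
    }
}
```

Because WriteToServer takes the OracleDataReader directly, the rows are streamed rather than buffered in memory, which matters at this data volume.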
I will provide some links that you can refer to for more information about creating packages programmatically and dynamic column mapping:
Upvotes: 0