Reputation: 429
I have a lot of databases (100+). Each one has the same structure but a different connection. I'm using Kettle to run a transformation against each of these databases in order to build a data warehouse.
How can I automate running the same transformation with different connections?
I already tried the approach from "Pass DB Connection parameters to a Kettle a.k.a PDI table Input step dynamically from Excel", but it only accepts a single row in the CSV.
Should I create a loop, or will I need to write a script?
Any help would be appreciated. (Sorry for my English)
Upvotes: 1
Views: 1619
Reputation: 383
You can do it with a loop. Don't worry, it's not hard to build one in Pentaho.
First of all, you will use a job to create your loop:
START --> Transform_that_holds_parameters --> Transform_to_run_in_a_loop
As you can guess, the transformation that runs identically on each DB is the last one in this flow. But we need to set two Advanced flags on that job entry: 'Execute for every input row' and 'Copy previous result to parameters'.
Then we need to build our Transform_that_holds_parameters with the following structure:
Some_sort_of_input --> copy_rows_to_result
Here you will have to grab all the connection parameters from somewhere, be it an Excel file or a table in another database. Once you grab this data, be sure to have exactly one row for each database you want to run your transformation against.
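To make the "one row per database" idea concrete, here is a small Python sketch of what that input could look like. It is only an illustration: the column names (db_host, db_port, db_name, db_user, db_pass) and the sample values are my assumptions, not something Kettle requires, and in the real job this data would come from your Excel file or table rather than from a string.

```python
import csv
import io

# Hypothetical column names; use whatever your own source provides.
REQUIRED = ["db_host", "db_port", "db_name", "db_user", "db_pass"]

# Stand-in for the Excel file or table: one row per target database.
SAMPLE = """db_host,db_port,db_name,db_user,db_pass
host-a.example,5432,sales_a,etl,secret-a
host-b.example,5432,sales_b,etl,secret-b
"""

def load_connection_rows(text):
    """Return one dict per database, failing early on missing values."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for i, row in enumerate(rows, start=1):
        missing = [c for c in REQUIRED if not row.get(c)]
        if missing:
            raise ValueError(f"row {i} is missing {missing}")
    return rows

rows = load_connection_rows(SAMPLE)
print(len(rows), "databases:", [r["db_name"] for r in rows])
```

Checking for empty values up front is a design choice: a blank host or password is much easier to spot here than in the middle of a 100-database run.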
Connect that input to the 'Copy rows to result' step. This step sends the rows back to our job, and, if you remember, our next transformation is set to 'Execute for every input row' and 'Copy previous result to parameters'.
Now, take careful note of the column names that reach the last step of that transformation; you will need them in the next step.
Go back to the job and open the properties of Transform_to_run_in_a_loop. Open the parameters and fill in the 'parameter' and 'stream column name' columns with the columns we just copied to the result.
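If it helps to picture what those two flags and the parameter mapping do together, here is a rough Python model of the behaviour. This is not Kettle's API; the function names, parameter names (DB_HOST, DB_NAME) and column names are all made up for the sketch.

```python
# Hypothetical mapping as filled in on the job entry:
# 'parameter' name -> 'stream column name' from the result rows.
PARAMETER_MAPPING = {
    "DB_HOST": "db_host",
    "DB_NAME": "db_name",
}

def run_for_every_row(result_rows, mapping, transformation):
    """Mimic 'Execute for every input row' plus 'Copy previous result to parameters'."""
    outputs = []
    for row in result_rows:
        # Build this run's parameters from the current result row.
        params = {param: row[column] for param, column in mapping.items()}
        outputs.append(transformation(params))
    return outputs

def fake_transformation(params):
    # Stand-in for Transform_to_run_in_a_loop.
    return f"loaded {params['DB_NAME']} from {params['DB_HOST']}"

# Rows as they would arrive from 'Copy rows to result'.
result_rows = [
    {"db_host": "host-a.example", "db_name": "sales_a"},
    {"db_host": "host-b.example", "db_name": "sales_b"},
]
runs = run_for_every_row(result_rows, PARAMETER_MAPPING, fake_transformation)
print(runs)
```

The point of the model: the transformation runs once per row, and each run only sees the values of its own row.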
Inside your transformation, you will need to define the same parameters with exactly the same names, and then use these parameters in your connection settings.
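Kettle references a parameter as ${NAME} inside a field. Python's string.Template happens to use the same placeholder syntax, so the sketch below imitates the substitution to show why the names have to match exactly. The field labels and parameter names are assumptions for the example, and this only mimics what Kettle does; it is not how Kettle is implemented.

```python
from string import Template

# Hypothetical connection fields as they might be typed into the dialog.
connection_fields = {
    "Host Name": "${DB_HOST}",
    "Database Name": "${DB_NAME}",
    "Port Number": "${DB_PORT}",
}

def resolve(fields, params):
    """Replace every ${NAME} placeholder; raises KeyError if a name has no match."""
    return {k: Template(v).substitute(params) for k, v in fields.items()}

params = {"DB_HOST": "host-a.example", "DB_NAME": "sales_a", "DB_PORT": "5432"}
resolved = resolve(connection_fields, params)
print(resolved)
```

In this sketch, passing a parameter called db_host instead of DB_HOST raises a KeyError, which is the Python equivalent of a connection that never receives its value because the names differ.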
Done. Now the first transformation sets all the parameters, and the second one runs once for each database configuration you have.
Upvotes: 2