Reputation:
I have an ETL process written in Kettle. It transfers data from the operational data source (MS SQL on Windows) to the data warehouse (MySQL on Ubuntu).
I want to schedule the Kettle job(s) for daily execution to populate the dimension tables and the fact table, so that my data is up to date and ready for analysis and reporting.
How can I schedule the execution of Kettle jobs?
Upvotes: 2
Views: 6294
Reputation: 3294
In Pentaho, scheduled execution can be handled through the Carte server: http://wiki.pentaho.com/display/EAI/Carte+User+Documentation
By combining the scheduling parameters of your job's Start step with the Carte server, you will be able to run this Kettle job whenever you want.
Upvotes: 1
Reputation: 6023
In your Kettle installation directory are several batch files, among them spoon.bat, pan.bat and kitchen.bat. Spoon is the UI you already know, pan is a command line tool to run transformations (.ktr files) and kitchen is a command line tool to run Kettle jobs (.kjb files).
For a simple schedule, create a batch file that calls either kitchen.bat or pan.bat (depending on whether you want to run a job or a transformation). Then use the Windows Task Scheduler to run your batch file on whichever schedule you want.
This, for instance, would run a Kettle job with basic logging and append the log output to a log file (note the >>, since a single > would overwrite the file on every run):
kitchen.bat /file:"c:\etl\my_first_job.kjb" /level:Basic >> c:\etl\logs\logging_for_my_first_job.log
This is of course for Windows. If you run Kettle on Linux, you can use cron and the respective .sh files in the Kettle installation directory (pan.sh or kitchen.sh).
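For the Linux case, a wrapper script for cron might look roughly like the sketch below. It is only an illustration: the installation directory, job path, log path and script name are hypothetical placeholders, and the function name run_kettle_job is made up for this example. It uses the -file= and -level= option style of kitchen.sh.

```shell
#!/bin/sh
# Sketch of a cron wrapper for a Kettle job.
# All paths below are hypothetical defaults; adjust them to your setup.
KETTLE_DIR="${KETTLE_DIR:-/opt/data-integration}"
JOB_FILE="${JOB_FILE:-/home/etl/my_first_job.kjb}"
LOG_FILE="${LOG_FILE:-/home/etl/logs/my_first_job.log}"

run_kettle_job() {
    # kitchen.sh runs jobs (.kjb); use pan.sh instead for a transformation (.ktr).
    # ">>" appends to the log, "2>&1" also captures error output.
    "$KETTLE_DIR/kitchen.sh" -file="$JOB_FILE" -level=Basic >> "$LOG_FILE" 2>&1
}

# Only execute when called with the argument "run", so the file can also be sourced.
if [ "${1:-}" = "run" ]; then
    run_kettle_job
fi

# Example crontab entry (edit with "crontab -e"), daily at 02:00.
# The script location is an assumption:
# 0 2 * * * /home/etl/run_my_first_job.sh run
```

Cron runs the script as the user who owns the crontab, so install the entry for the same user that normally runs the job.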
Since Kettle stores shared database connections in the user profile, make sure the user running the scheduled task has those connections in their profile; otherwise your transformations will fail.
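On Linux, a small pre-flight check along these lines could catch a missing profile before the job runs. This is a sketch under an assumption: that the shared connections are kept in a file named shared.xml inside the .kettle directory of the user's home, or under KETTLE_HOME if that variable is set. The function name check_shared_connections is invented for this example.

```shell
#!/bin/sh
# Assumption: shared connections live in <base>/.kettle/shared.xml,
# where <base> is $KETTLE_HOME if set, otherwise the user's home directory.
check_shared_connections() {
    kettle_base="${KETTLE_HOME:-$HOME}"
    shared_file="$kettle_base/.kettle/shared.xml"
    if [ -r "$shared_file" ]; then
        echo "found: $shared_file"
        return 0
    fi
    # Report the problem on stderr so it ends up in the scheduler's error output.
    echo "missing or unreadable: $shared_file" >&2
    return 1
}
```

Calling this function as the scheduled user before starting kitchen.sh shows whether that user can actually read the connection definitions.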
Upvotes: 3