Reputation: 11
I am new to Pentaho and I have a requirement to read multiple SQL statements from a file and store the output in a file or datastore. The output/select columns are the same for all of the SQL statements. How can I achieve this? E.g.:
Select last_name, first_name, age from employee where age > 25;
Select last_name, first_name, age from employee where salary > 25000;
Select last_name, first_name, age from employee where deptid in (Select deptid from dept where deptid like 'Dept-E00%');
The data that I need to load is huge (about 25 million rows per day), and the SQL statements above are just samples. Selecting all the rows and then applying filters row by row is not an option for me.
A .ktr file would be very helpful.
Thanks in advance.
Upvotes: 1
Views: 3998
Reputation: 3968
You can also try the approach below:
Read the SQL statements from the file in a first transformation and hand them over with a Copy rows to result step. Then, in the job entry that runs a separate transformation, define a parameter/variable for the query and check "Execute every input row?" and "Copy previous results to parameters?", as sketched below.
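For illustration only, here is a minimal sketch of what that job entry could look like in the .kjb XML. The entry, file, and field names (run_query.ktr, queries, sql_statement) are hypothetical, and the tag names are from memory, so they may differ between PDI versions:

<!-- Hypothetical sketch of the job entry that calls the child transformation.
     exec_per_row / params_from_previous correspond to the two checkboxes. -->
<entry>
  <name>run query</name>
  <type>TRANS</type>
  <filename>${Internal.Job.Filename.Directory}/run_query.ktr</filename>
  <exec_per_row>Y</exec_per_row>                  <!-- Execute every input row? -->
  <params_from_previous>Y</params_from_previous>  <!-- Copy previous results to parameters? -->
  <parameters>
    <parameter>
      <name>queries</name>                        <!-- parameter of the child transformation -->
      <stream_name>sql_statement</stream_name>    <!-- result-row field holding one SQL statement -->
    </parameter>
  </parameters>
</entry>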
Finally, take a Table Input step in the child transformation and use the variable defined above as the whole SQL statement. Here I have used ${queries} as the variable.
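In other words, the Table Input's SQL box contains only the variable, with "Replace variables in script?" checked. A minimal sketch of that step's XML, assuming a connection named my_db (the tag names are from memory and may vary by PDI version):

<!-- Hypothetical sketch of the Table Input step in the child .ktr.
     The entire SQL text is substituted from the ${queries} parameter. -->
<step>
  <name>Table Input</name>
  <type>TableInput</type>
  <connection>my_db</connection>         <!-- assumed connection name -->
  <sql>${queries}</sql>                  <!-- the whole query comes from the variable -->
  <limit>0</limit>
  <variables_active>Y</variables_active> <!-- Replace variables in script? -->
</step>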
Note: this assumes the output metadata (the column structure) remains the same for every statement. If it changes between queries, I assume it won't work.
I assume this process will scale to millions of records; I haven't tried it at this volume, but you can give it a try.
Here is the gist of the code shown above.
You can also read this blog in case you need to know more.
Hope it helps :)
Upvotes: 4
Reputation: 4544
You can run the SQL scripts by using the "Execute row SQL script" step. It will run every script in sequential order and return the result sets of them all in a single data stream. I suggest you add a constant field to each of those SQL statements to describe what they refer to (e.g. select 'age_filter' as query_tag, ... in the first one), as I'm not sure the Execute row SQL script step will include the SQL statement itself in the output stream.
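As a rough sketch under stated assumptions: the step would read each statement from an incoming field (here the hypothetical field sql_statement, produced by whatever step reads the file). The connection name and the XML tag names below are assumptions from memory and may differ between PDI versions:

<!-- Hypothetical sketch of the Execute row SQL script step in a .ktr.
     It executes the SQL text found in the field "sql_statement" for each row. -->
<step>
  <name>Execute row SQL script</name>
  <type>ExecSQLRow</type>
  <connection>my_db</connection>                   <!-- assumed connection name -->
  <commit>100</commit>                             <!-- commit every 100 statements -->
  <sql_field_name>sql_statement</sql_field_name>   <!-- field carrying the SQL to run -->
</step>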
Upvotes: 0