Augustine
Augustine

Reputation: 11

How to read multiple SQL statements from a file and execute in pentaho

I am new to pentaho and I have a requirement to read multiple SQL statements from a file and store the output in a file or datastore. Output/select columns are common for all the sql statements. May I know how to achieve it. E.g

Select last_name, first_name age from employee where age is > 25;
Select last_name, first_name age from employee where salary is > 25000;
Select last_name, first_name age from employee where deptid in (Select deptid from dept where deptid like 'Dept-E00%');

The data that I need to load is huge (25 million/day) and the above SQL statements are just samples. Selecting all the rows and applying filters on each row is not an option for me.

A .ktr file is much helpful.

Thanks in advice.

Upvotes: 1

Views: 3998

Answers (2)

Rishu S
Rishu S

Reputation: 3968

You can also try as below:

  1. Read all the SQL queries into a memory using the "Copy rows to result". Check the image as below:

enter image description here

  1. In a separate transformation, define a variable and check : execute for every input row and copy previous result to parameter.

  2. Finally, Take a "Table Input Step" and use that variable as defined above. Check the image below:

enter image description here

Here i have used ${queries} as the variable.

Note: Assuming that the output step metadata structure will remain the same. If it is changing, i assume it won't work.

I assume this process will scale up for millions of record. I haven't tried it for this huge volume, but you can give it a try.

Here is the gist of the code as shown above.

You can read this blog also, in case you need to know more.

Hope it helps :)

Upvotes: 4

nsousa
nsousa

Reputation: 4544

You can run the SQL scripts by using the "Execute row SQL script". It'll run every script in sequential order and return the resultset of them all into a single data stream. I suggest you add a constant field to each of those SQL statements to include a description of what they refer to, as I'm not sure the Execute Row SQL script will include the SQL statement in the output stream.

Upvotes: 0

Related Questions