DUnkn0wn1

Reputation: 411

Pentaho Data Integration (PDI): After selecting records I need to update the field value in the table using a Pentaho transformation

I have a requirement to create a transformation that runs a select statement. After selecting the rows, it should update their status so the same record isn't processed again.

SELECT file_id, location, name, status
FROM files

OUTPUT:

1, c/user/, abc, PROCESS

Updated output should be:

1, c/user/, abc, INPROCESS

Is it possible to do a database select and cache the records within a single PDI transformation, so the same record isn't reprocessed? That way I wouldn't need to update the status in the database. Something similar to a dynamic lookup in Informatica. If not, what is the best way to update the database after doing the select?


Upvotes: 0

Views: 795

Answers (1)

Brian.D.Myers

Reputation: 2518

Thanks, that helps. You wouldn't do this in a single transformation: because of the multi-threaded execution model of PDI transformations, you can't count on a variable being set until the transformation ends.

The way to do it is to put two transformations in a Job and create a variable in the Job. The first transformation runs your select and flows the result into a Set Variables step. Configure that step to set the variable you created in your Job. Next, run the second transformation, which contains your Excel Input step, and specify your Job-level variable as the file name.

If the select gives more than one result, you can store the file names in the Job's file results area. You do this with a Set files in result step. Then you can configure the Job to run the second transformation once for each result file.
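
For the select-then-update requirement in the question, the database side of the logic can be sketched outside PDI. The following is only an illustration in Python with the standard sqlite3 module, not a PDI configuration. The table and column names come from the question; the in-memory database, the `claim_files` helper, and the `WHERE status = 'PROCESS'` filter are assumptions added for the example.

```python
import sqlite3


def claim_files(conn):
    """Select rows in PROCESS status and mark them INPROCESS in one transaction.

    Hypothetical helper: the WHERE filter is an assumption, the question's
    select has no filter.
    """
    with conn:  # commits on success, rolls back if an exception is raised
        rows = conn.execute(
            "SELECT file_id, location, name, status "
            "FROM files WHERE status = 'PROCESS'"
        ).fetchall()
        # Mark every selected row so a later select does not return it again.
        conn.executemany(
            "UPDATE files SET status = 'INPROCESS' WHERE file_id = ?",
            [(row[0],) for row in rows],
        )
    return rows


# Hypothetical in-memory table holding the sample row from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE files ("
    "file_id INTEGER PRIMARY KEY, location TEXT, name TEXT, status TEXT)"
)
conn.execute("INSERT INTO files VALUES (1, 'c/user/', 'abc', 'PROCESS')")
conn.commit()

first = claim_files(conn)   # returns the row as it was when selected
second = claim_files(conn)  # nothing left in PROCESS status
print(first)   # [(1, 'c/user/', 'abc', 'PROCESS')]
print(second)  # []
```

Running the select and the update in one transaction is what keeps a second run from picking up the same record; in PDI the equivalent would presumably be an update step fed by the select, but that mapping is not confirmed by anything in this thread.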

Upvotes: 1
