Reputation: 174
I am researching the standard sample from the Pentaho DI package: GetXMLData - Read parent children rows. It reads parent rows and children rows separately from the same XML input. I need to do the same and update two different sheets of the same MS Excel document.
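To make the parent/children split concrete outside of Pentaho, here is a minimal Python sketch of the same idea: parse the XML once and emit two separate row sets, with each child row carrying its parent key. The sample XML, the element and attribute names, and the function `read_parent_and_child_rows` are all assumptions made for illustration; they are not taken from the Pentaho sample file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the element and attribute names are assumptions,
# not the contents of the Pentaho sample file.
SAMPLE_XML = """
<orders>
  <order id="1" customer="Alice">
    <item sku="A-100" qty="2"/>
    <item sku="B-200" qty="1"/>
  </order>
  <order id="2" customer="Bob">
    <item sku="C-300" qty="5"/>
  </order>
</orders>
"""


def read_parent_and_child_rows(xml_text):
    """Return (parent_rows, child_rows) extracted from one parse of the XML."""
    root = ET.fromstring(xml_text)
    parent_rows = []
    child_rows = []
    for order in root.findall("order"):
        order_id = order.get("id")
        parent_rows.append({"order_id": order_id, "customer": order.get("customer")})
        for item in order.findall("item"):
            # Each child row carries the parent key so the two sheets can be joined later.
            child_rows.append(
                {"order_id": order_id, "sku": item.get("sku"), "qty": int(item.get("qty"))}
            )
    return parent_rows, child_rows


parents, children = read_parent_and_child_rows(SAMPLE_XML)
print(parents)
print(children)
```

Each of the two row sets would then go to its own sheet of the workbook.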
My understanding is that the normal way to achieve this is to put the first sequence in one transformation file with XML Output or Writer, the second sequence in a second one, and finally create a job that chains from Start through the 1st and 2nd transformations.
My problems are:
KJB job + 2 KTR transformation files).
Questions are:
wait node before starting update 2nd Excel sheet?
=================
UPDATE:
Based on @AlainD's proposal I have tried to put a Block node in between. Here is the result:
It looks like the Block step can be an option, but somehow it doesn't work as expected with the Excel Output / Writer nodes (or I am doing something wrong). What I have observed is that Pentaho tries to execute the steps after Block before the Excel file has been closed properly by the previous step. That leads to one of the following: I either get an Excel file with one empty sheet, or the generated result file is malformed.
My input XML file (from Pentaho distribution) & test playground transformation are: HERE
NOTE: While experimenting, do not forget to remove the generated MS Excel files between runs.
Any suggestions on how to fix my transformation?
Upvotes: 0
Views: 2299
Reputation: 6356
The pattern goes as follows:
This is only a pattern; you may want to change the flow and/or sort to speed things up. But it will not lock up, nor fill up the memory: the Group by and lookup steps are pretty reliable.
Upvotes: 1
Reputation: 6356
Question 1: Yes, the step you are looking for is named Block this step until steps finish, or Blocking Step (until all rows are processed).
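The blocking idea can be sketched outside of Pentaho as well. Steps in a PDI transformation run concurrently, so a second writer may touch the file before the first one has closed it. The Python sketch below is only an analogy, not Pentaho code: a plain text file stands in for the workbook, and the names `write_sheet` and `run_two_writers` are hypothetical. The second writer blocks on an event that the first writer sets only after its file handle is closed.

```python
import os
import tempfile
import threading


def write_sheet(path, sheet_name, rows, wait_for=None, done=None):
    """Append one 'sheet' section to a text file standing in for the workbook."""
    if wait_for is not None:
        wait_for.wait()  # block until the upstream writer has closed the file
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(f"[{sheet_name}]\n")
        for row in rows:
            fh.write(f"{row}\n")
    # The with-block has closed the file here, so the next writer can be released.
    if done is not None:
        done.set()


def run_two_writers(path):
    parents_done = threading.Event()
    # The child writer is started first to show that it really waits.
    child = threading.Thread(
        target=write_sheet,
        args=(path, "children", ["c1", "c2", "c3"]),
        kwargs={"wait_for": parents_done},
    )
    parent = threading.Thread(
        target=write_sheet,
        args=(path, "parents", ["p1", "p2"]),
        kwargs={"done": parents_done},
    )
    child.start()
    parent.start()
    parent.join()
    child.join()
    with open(path, encoding="utf-8") as fh:
        return fh.read().splitlines()


workbook_path = os.path.join(tempfile.mkdtemp(), "workbook.txt")
lines = run_two_writers(workbook_path)
print(lines)
```

The point of the design is that the signal is sent after the file is closed, not after the last row is written; releasing the second writer any earlier reproduces the kind of race described in the question.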
Question 2: Yes, you can pass the rows from one transformation to another via the job. But it would be wiser to first produce the parent sheet and, once that has finished, read it again in the second transformation. You can also pass the rows to a sub-transformation, or use other architecture strategies...
Question 3: (Short answer) The Excel Writer appends data (a new sheet or new rows) to an existing Excel file, while the Excel Output creates and feeds a one-sheet Excel file.
Upvotes: 0