Reputation: 485
I have a Pig script that loads, processes, and stores data.
If there are two STORE statements in the same Pig script, how does it work?
a = load 'somefile' using PigStorage(',');
b ...
c ...
d ...
store d into 'output1';
store c into 'output2';
Does this run twice, once for each store? That is, for the first store, does it process everything from 'a' through 'd', and for the second store, does it write 'c' directly since it has already been computed, or does it start again from 'a'?
Upvotes: 1
Views: 709
Reputation: 25909
Generally speaking, the underlying MapReduce framework supports multiple outputs, so Pig can use that to run a script with two stores in a single job, for example by having separate reducers that each write to a different output.
However, the actual MapReduce plan depends on what you do to get to c and d; sometimes that processing requires more than one job. To understand how your script behaves, you can use Pig's explain command. If you want a graphical visualization, you can use Netflix's Lipstick.
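As a rough sketch of how to check this yourself: the script below fills in the intermediate steps with made-up operations. The schema, the FILTER, and the GROUP are assumptions for illustration, not the logic from the question, and the file name multi_store.pig is hypothetical. It assumes multi-query execution is enabled, which I believe is the default.

```pig
-- multi_store.pig (hypothetical file name)
-- The schema and the transformations below are placeholders.
a = LOAD 'somefile' USING PigStorage(',') AS (id:chararray, val:int);
c = FILTER a BY val > 0;                                  -- shared by both stores
g = GROUP c BY id;
d = FOREACH g GENERATE group AS id, SUM(c.val) AS total;

STORE d INTO 'output1';
STORE c INTO 'output2';

-- To see the plan for the whole script (both stores together),
-- run this from the Grunt shell instead of executing the script:
--   explain -script multi_store.pig;
```

In the printed plan, check whether both stores hang off one shared pipeline or end up in separate jobs. If you want to compare against the non-combined behavior, running Pig with -no_multiquery (or -M) should turn the combined execution off.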
Upvotes: 2