Shri

Reputation: 485

Two store statements in one Pig script

I have a Pig script that loads, processes, and stores data.

If there are two store statements in the same Pig script, how does it work?

a = load 'somefile' using PigStorage(',');
b ...
c ...
d ...
store d into 'output1';
store c into 'output2';

Does this run twice, once for each store? That is, for the first store (d into 'output1') does it process everything from 'a' to 'd', and for the second store (c into 'output2') does it store 'c' directly since it has already been computed, or does it start again from 'a'?

Upvotes: 1

Views: 709

Answers (1)

Arnon Rotem-Gal-Oz

Reputation: 25909

Generally speaking, the underlying MapReduce framework supports multiple outputs, so Pig can use that and run a script with two stores in a single job, e.g. by having separate reducers that each write to a different file.

However, the actual MapReduce plan depends on what you do to get to c and d. Sometimes that processing requires more than a single job. To understand how your script behaves, you can use Pig's explain command. If you want a graphical visualization, you can use Netflix's Lipstick.
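As a rough sketch of how to check this yourself: the script below fills in the elided steps from the question with made-up operations (the filter, projection, and group/count are assumptions purely for illustration, as are the alias g and the file name twostores.pig). With Pig's multi-query execution, which is on by default, I would expect 'a' to be loaded once and the work up to 'c' to be shared between the two stores, but the explain output is what tells you for your actual script.

```pig
-- twostores.pig (hypothetical file name)
-- To see the plan without running the job, from the Grunt shell:
--   explain -script twostores.pig;

a = load 'somefile' using PigStorage(',');

-- Assumed steps standing in for the elided b, c, d in the question
b = filter a by $0 is not null;
c = foreach b generate $0, $1;
g = group c by $0;
d = foreach g generate group, COUNT(c);

-- store is a statement, not an expression, so it is not assigned to an alias
store d into 'output1';
store c into 'output2';
```

In the explain output, look at the MapReduce plan section to see how many jobs are produced and whether both stores hang off the same load. To compare against the one-run-per-store behaviour, multi-query execution can be turned off by starting Pig with -M (or -no_multiquery).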

Upvotes: 2
