Reputation: 4483
I have a relation MY_REL that is the result of a join of X and Y:
MY_REL = {X::x1,X::x2,Y::y1,Y::y2}
And I tried to do
Bla = foreach MY_REL generate X;
Pig vomited:
ERROR 1000: Error during parsing. Scalars can be only used with projections
I tried X::* and it throws: invalid alias X.
The ugly workaround: I switched to explicitly writing all column names:
Bla = foreach MY_REL generate X::x1, X::x2;
Is there a nice way to generate all X's columns?
Upvotes: 2
Views: 3154
Reputation: 7082
If your columns before the JOIN all have a different name, you can just use them as is later:
Bla = foreach MY_REL generate x1 + y1, x2 + y2;
If there is only one conflict, you need to use the original relation prefix to disambiguate:
Bla = foreach MY_REL generate x1, x2, y1, Y::y1 AS y2;
And with the new Pig project-range syntax (PIG-1693):
Bla = foreach MY_REL generate ..$2, $3 AS y2;
There is also some discussion of this in PIG-2511.
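Applied to the question, a project-range can select all of X's columns by position. This is only a sketch: it assumes a Pig version that includes the range syntax, and that X's two columns occupy positions $0 and $1 of MY_REL, as in the schema shown in the question.

```pig
-- Assumed schema of MY_REL after the join: {X::x1, X::x2, Y::y1, Y::y2}
-- $0 .. $1 projects the first two fields, which are X's columns here.
Bla = FOREACH MY_REL GENERATE $0 .. $1;
```

The positions have to be adjusted if X has a different number of columns or is not the first relation in the join.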
Upvotes: 1
Reputation: 39893
Instead of using JOIN, use COGROUP. COGROUP will create a relation that looks like {X : {x1, x2}, Y : {y1, y2}}. Therefore, you can do:
foreach MY_REL GENERATE FLATTEN(X);
Note that it is a bag in there, so you want to flatten it.
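A fuller sketch of the COGROUP approach follows. The file names, field types, and the choice of x1 and y1 as keys are assumptions for illustration only; the question does not state them.

```pig
-- Hypothetical inputs; paths and types are placeholders.
X = LOAD 'x.tsv' AS (x1:int, x2:chararray);
Y = LOAD 'y.tsv' AS (y1:int, y2:chararray);

-- COGROUP keeps each input in its own bag, next to the group key.
MY_REL = COGROUP X BY x1, Y BY y1;

-- Optional: keep only keys present in both inputs, like an inner join.
MATCHED = FILTER MY_REL BY NOT IsEmpty(X) AND NOT IsEmpty(Y);

-- FLATTEN turns the bag X back into rows carrying all of X's columns.
Bla = FOREACH MATCHED GENERATE FLATTEN(X);
```

Be aware that this is not row-for-row identical to the JOIN: each X row appears once per key here, whereas a JOIN repeats it for every matching Y row.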
Upvotes: 3