Reputation: 18279
I currently resort to a two-step approach, a JOIN followed by a FOREACH that renames the fields:
A = LOAD 'a.txt' USING PigStorage('\\u001') AS (
foo:int
,bar:chararray
);
B = LOAD 'b.txt' USING PigStorage('\\u001') AS (
foo:int
,baz:long
);
C = JOIN A BY foo, B BY foo;
D = FOREACH C GENERATE
A::foo AS foo
,A::bar AS bar
,B::baz AS baz
;
How can I join and define the schema in a single step?
Upvotes: 0
Views: 213
Reputation: 10650
According to the documentation, you can't define a schema when joining relations.
Note:
Syntactically, you can nest the statements so that it looks as if you have saved some steps, like this:
D = FOREACH (
    JOIN (LOAD 'a.txt' USING PigStorage('\\u001') AS (foo:int, bar:chararray)) BY foo,
         (LOAD 'b.txt' USING PigStorage('\\u001') AS (foo:int, baz:long)) BY foo
) GENERATE $0 AS foo, $1 AS bar, $3 AS baz;  -- $2 is B's copy of the join key foo, so it is skipped
But I'd avoid doing so. It's hard to read, and it generates the same explain plan as the original script anyway.
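If you want to check this yourself, a minimal sketch along the following lines should do. It assumes the relations C and D from the question's script are already defined in the same Pig session, and it only uses the built-in DESCRIBE and EXPLAIN operators:

```pig
-- Assumes C and D are defined as in the question's script.
DESCRIBE C;  -- schema right after the JOIN; field names carry the A:: and B:: prefixes
DESCRIBE D;  -- schema after the FOREACH; the fields are renamed to foo, bar, baz
EXPLAIN D;   -- prints the execution plans Pig builds for D
```

Running EXPLAIN D against both the two-step script and the nested version lets you compare the plans side by side.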
Upvotes: 3