Reputation: 9828
I need to join multiple tables. The command that I am using is as follows:
G = JOIN aa BY f, bb by f, cc by f, dd by f;
To make it a full outer join, I added FULL:
G = JOIN aa BY f FULL, bb by f, cc by f, dd by f;
But it gives me a "mismatched input" error message. How can I make this work?
Thanks!
Upvotes: 2
Views: 10360
Reputation: 935
You can use the COGROUP statement to imitate a full outer join. For example, cogroup the following two files:
Decimal.csv
first|1
second|2
fourth|4
Roman.csv
first|I
second|II
third|III
Pig commands:
english = LOAD 'Decimal.csv' using PigStorage('|') as (name:chararray,value:chararray);
roman = LOAD 'Roman.csv' using PigStorage('|') as (name:chararray, value:chararray);
multi = cogroup english by name, roman by name;
dump multi;
Output:
(first,{(first,1)},{(first,I)})
(third,{},{(third,III)})
(fourth,{(fourth,4)},{})
(second,{(second,2)},{(second,II)})
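COGROUP also accepts more than two relations, so the same approach can be sketched for the four relations in the question. This assumes aa, bb, cc and dd are already loaded and that each has a field named f, as in the question; the alias multi4 is only an illustrative name.

```pig
-- Assumes aa, bb, cc, dd are already defined and each has a field f.
-- One output tuple per distinct key: (group, {aa rows}, {bb rows}, {cc rows}, {dd rows}).
-- A bag is empty when that relation has no row for the key.
multi4 = COGROUP aa BY f, bb BY f, cc BY f, dd BY f;
DUMP multi4;
```

As in the two-file output above, the result is grouped bags rather than flat joined rows, so keys missing from a relation show up as an empty bag instead of null fields.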
Upvotes: 1
Reputation: 10650
According to the Pig documentation:
Outer joins will only work for two-way joins; to perform a multi-way outer join, you will need to perform multiple two-way outer join statements.
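A minimal sketch of what that could look like for the four relations in the question. It assumes aa, bb, cc and dd are loaded with declared schemas, that f has the same type in each, and it uses hypothetical intermediate aliases (ab, ab_k, abc, abc_k, k, k2). After a full outer join the key can be null on either side, so each step first picks whichever side's key is present before joining the next relation.

```pig
-- Step 1: two-way full outer join of aa and bb.
ab = JOIN aa BY f FULL OUTER, bb BY f;

-- Take the key from whichever side is non-null, so rows that exist
-- only in bb can still match rows in cc.
ab_k = FOREACH ab GENERATE (aa::f IS NOT NULL ? aa::f : bb::f) AS k, *;

-- Step 2: bring in cc.
abc = JOIN ab_k BY k FULL OUTER, cc BY f;
abc_k = FOREACH abc GENERATE (ab_k::k IS NOT NULL ? ab_k::k : cc::f) AS k2, *;

-- Step 3: bring in dd.
G = JOIN abc_k BY k2 FULL OUTER, dd BY f;
```

Joining directly on aa::f at each step would be simpler, but rows whose key is missing from aa would then carry a null join key and never match later relations.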
Upvotes: 7