Reputation: 60004
In Pig, when I do a left join and a row has no matching row in the other relation, the joined values are NULL:
c = join a by ($0) left, b by ($0);
if
a=((1,10),(2,20))
b=((1,30))
then
c=((1,10,30),(2,20,NULL))
I want to use a default value (say, -1) instead of NULL, so that
c=((1,10,30),(2,20,-1))
How do I do that?
If that is impossible, how do I change the 3rd column of c to hold the default value instead of NULL?
Upvotes: 3
Views: 3372
Reputation: 3273
I am not aware of a way to do that within the join statement, but you can add another statement:
d = FOREACH c GENERATE $0, $1, (($2 IS NULL) ? -1 : $2);
I don't expect this to trigger an additional MapReduce job.
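For a fuller picture, here is a minimal end-to-end sketch of the same idea. The file names `a.txt` and `b.txt` and the field names `id`, `x`, and `y` are assumptions made for illustration; they are not from the question. Note that a real Pig join keeps the fields of both relations, including b's copy of the key, so the value column sits one position later than in the simplified 3-column output shown in the question. Referring to fields by name avoids counting positions.

```pig
-- Hypothetical tab-separated input files (names are assumptions):
--   a.txt contains the rows (1,10) and (2,20)
--   b.txt contains the row  (1,30)
a = LOAD 'a.txt' AS (id:int, x:int);
b = LOAD 'b.txt' AS (id:int, y:int);

-- Left outer join: every row of a is kept; b's fields are NULL when no match exists.
-- The result has four fields: a::id, a::x, b::id, b::y
c = JOIN a BY id LEFT OUTER, b BY id;

-- Replace a NULL b::y with the default -1 using the bincond operator,
-- and drop b's duplicate copy of the key.
d = FOREACH c GENERATE a::id AS id, a::x AS x, ((b::y IS NULL) ? -1 : b::y) AS y;

DUMP d;
```

With the inputs described in the comments, this should yield the tuples (1,10,30) and (2,20,-1), though the order of the dumped rows is not guaranteed.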
Upvotes: 6