Reputation: 118
I'm trying to implement a conditional FOREACH ... GENERATE where the value of one column depends on the input data.
Say for example I have this data in alias A:
dump A;
(George, Films)
(Martin, Books)
I want to store a Y if the name starts with G. From the documentation I know there is a conditional (bincond) operator, but I cannot find a way to express the 'starts with X' test. I think it should be something like this, where ##### is the missing condition:
B = FOREACH A GENERATE (##### ? 'Y' : 'N');
Upvotes: 1
Views: 887
Reputation: 561
You're looking for the SUBSTRING function. Use it like this:
b = foreach a generate $0.., (SUBSTRING($0,0,1)=='G'?'y':'n');
Read more about it here: https://pig.apache.org/docs/r0.9.1/func.html#substring
It would give you:
(George,Films,y)
(Martin,Books,n)
Upvotes: 1
Reputation: 5551
You can apply UDFs within GENERATE:
B = FOREACH A GENERATE MyUdf(name);
Where MyUdf is a function you write to perform the logic you want. I'm not aware of a way to do this without UDFs.
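For illustration, here is a minimal sketch of what such a UDF might look like if written in Python and registered through Pig's Jython support. The file name udfs.py, the namespace myudfs, and the function name starts_with_g are all hypothetical. The fallback definition of outputSchema is an assumption added only so the file can also be loaded under plain Python for a quick local check; when registered with Pig, the decorator is expected to be supplied by Pig itself.

```python
# udfs.py (hypothetical file name)
# When this script is registered with Pig via Jython, Pig is expected to
# provide the `outputSchema` decorator. The stand-in below is an assumption
# that lets the module load under plain Python for local testing.
try:
    outputSchema
except NameError:
    def outputSchema(schema):
        def decorator(func):
            return func
        return decorator


@outputSchema("flag:chararray")
def starts_with_g(name):
    """Return 'Y' when name begins with 'G', otherwise 'N'."""
    if name is None:
        # Pass nulls through instead of guessing a flag.
        return None
    return "Y" if name.startswith("G") else "N"


# Assumed Pig Latin usage (not verified against a specific Pig version):
#   REGISTER 'udfs.py' USING jython AS myudfs;
#   B = FOREACH A GENERATE $0.., myudfs.starts_with_g($0);
```

With the sample data from the question, this should flag George with Y and Martin with N. Returning null for a null name is a design choice of this sketch, not something the question specifies.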
Upvotes: 0