Javier Cabero

Reputation: 118

Apache Pig conditional foreach generate

I'm trying to implement a conditional FOREACH ... GENERATE in which the value of one column depends on the input data.

Say for example I have this data in alias A:

dump A;
(George, Films)
(Martin, Books)

I want to store a Y if the name starts with G. From the documentation I know there is a conditional (bincond) operator, but I cannot find a way to express the 'starts with X' condition. I think it should be something like this, where ##### is the missing condition.

B = FOREACH A GENERATE (##### ? 'Y' : 'N');

Upvotes: 1

Views: 887

Answers (2)

Ran Locar

Reputation: 561

You're looking for the SUBSTRING function. Use it like this:

B = FOREACH A GENERATE $0.., (SUBSTRING($0, 0, 1) == 'G' ? 'y' : 'n');

Read more about it in the documentation:

https://pig.apache.org/docs/r0.9.1/func.html#substring

It would give you:

(George,Films,y)
(Martin,Books,n)

Upvotes: 1

Ben Watson

Reputation: 5551

You can apply UDFs within GENERATE:

B = FOREACH A GENERATE MyUdf(name);

Here, MyUdf is a function you write to perform the logic you want. I'm not aware of a way to do this without UDFs.
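For illustration, here is a minimal sketch of what such a UDF could look like if you write it in Python and register it through Pig's Jython support. The file name starts_with.py, the namespace myfuncs, and the function name starts_with_g are all made up for this example, and it assumes that A has a chararray field called name.

```python
# starts_with.py (hypothetical file name): sketch of a Python UDF for Pig.
#
# Assumed usage from a Pig script (names are illustrative only):
#   REGISTER 'starts_with.py' USING jython AS myfuncs;
#   B = FOREACH A GENERATE name, myfuncs.starts_with_g(name);

# Pig's Jython engine supplies the outputSchema decorator by itself.
# The fallback below only exists so the file also runs under plain Python.
try:
    outputSchema
except NameError:
    def outputSchema(schema):
        def wrap(func):
            return func
        return wrap


@outputSchema("flag:chararray")
def starts_with_g(name):
    # Pig passes null fields as None; keep them null instead of failing.
    if name is None:
        return None
    if name.startswith('G'):
        return 'Y'
    return 'N'
```

Returning None for a null input keeps the result null in Pig rather than raising an error for that row. Whether a UDF is worth it for a check this small is a judgment call; it mostly pays off once the condition gets more involved.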

Upvotes: 0
