Reputation: 129
I have a problem that i need to replace string based on other field.
sample data:
field1|field2
1 x|x
1 x|z
2 p|p
2 z|p
required output:
field1|field2
1|x
1 x|z
2|p
2 z|p
i have tried the following but doesn't work (no error simply doesn't do anything)
a = LOAD '$path'
USING PigStorage('|')
AS (
field1:chararray,
field2:chararray
);
b = foreach a generate
REPLACE(field1,field2,'') as field1,
field2;
any ideas?
thanks
Upvotes: 0
Views: 299
Reputation: 3189
REPLACE
takes a regular expression so you need to construct that regular expression using CONCAT
.
So a rather cumbersome solution is :
B = FOREACH A GENERATE
REPLACE( field1, CONCAT ( CONCAT('(', field2), ')'), '')) as field1, field2;
so for the first row , the CONCAT will construct a regexp '(x)'
To get the output exactly like above I added a TRIM
B = FOREACH A GENERATE TRIM(REPLACE( field1, CONCAT ( CONCAT('(', field2), ')'), '')) as field1, field2;
A better solution would be to write your own udf - then your pig code would be more readable
e.g
B = FOREACH A GENERATE MyReplaceWithBlank( field1, field2) as field1, field2;
see https://pig.apache.org/docs/r0.12.1/udf.html#eval-functions
Upvotes: 2