clairvoyant
clairvoyant

Reputation: 129

Replace value using value from other field

I have a problem that i need to replace string based on other field.

sample data:

field1|field2
1 x|x 
1 x|z
2 p|p
2 z|p

required output:

field1|field2
1|x
1 x|z
2|p
2 z|p

i have tried the following but doesn't work (no error simply doesn't do anything)

a = LOAD '$path'
                USING PigStorage('|')
                AS (
                field1:chararray,
                field2:chararray
                );  

b = foreach a generate
REPLACE(field1,field2,'') as field1,
field2;

any ideas?

thanks

Upvotes: 0

Views: 299

Answers (1)

Louise Miller
Louise Miller

Reputation: 3189

REPLACE takes a regular expression so you need to construct that regular expression using CONCAT.

So a rather cumbersome solution is :

B = FOREACH A GENERATE 
    REPLACE( field1, CONCAT ( CONCAT('(', field2), ')'), '')) as field1, field2;

so for the first row , the CONCAT will construct a regexp '(x)'

To get the output exactly like above I added a TRIM

B = FOREACH A GENERATE TRIM(REPLACE( field1, CONCAT ( CONCAT('(', field2), ')'), '')) as field1, field2;

A better solution would be to write your own udf - then your pig code would be more readable

e.g

B = FOREACH A GENERATE MyReplaceWithBlank( field1, field2) as field1, field2;

see https://pig.apache.org/docs/r0.12.1/udf.html#eval-functions

Upvotes: 2

Related Questions