piyer96

Reputation: 11

Pig regex to extract data between tags

My text file (input):

City,Description
Chicago,One day car rental is <b>$90</b>
Dallas,One day car rental is <b>$65</b>

Output needed:

City,Costofrental
Chicago,$90
Dallas,$65

I am using REGEX_EXTRACT to get the cost ($) details, but I am not getting the desired output. I am new to regex, so please let me know what I am missing. TIA

A = LOAD '/user/Testfile.csv' USING PigStorage(',') AS(a1:chararray,a8:chararray); 
B = FOREACH A GENERATE a1,REGEX_EXTRACT(a8, '/<b>([0-9]*)</b>/',1);
dump B;

Upvotes: 1

Views: 114

Answers (1)

Aleksey Shein

Reputation: 7482

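You need to add an escaped \$ to your regex so that the capture group includes the dollar sign. Also drop the surrounding slashes: Pig's REGEX_EXTRACT takes a Java regular expression, which has no /.../ delimiters, so they would be matched as literal characters. Without them, the closing </b> tag needs no escaping:

'<b>(\\$[0-9]*)</b>'

The backslash is doubled because Pig string literals need special regex characters escaped with a double backslash.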
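Since REGEX_EXTRACT is backed by Java regular expressions, the pattern can be checked in plain Java before running the Pig job. The following is a minimal sketch, not Pig's actual implementation: it assumes substring (find-style) matching, uses the pattern without the surrounding slashes (Java would treat them as literal characters), and the class name RentalCostRegex and helper extractCost are hypothetical names chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RentalCostRegex {
    // No /.../ delimiters: Java regex would match the slashes literally.
    // "\\$" in Java source is the regex \$ (a literal dollar sign).
    static final Pattern COST = Pattern.compile("<b>(\\$[0-9]*)</b>");

    // Hypothetical helper: returns capture group 1, or null when nothing matches.
    static String extractCost(String description) {
        Matcher m = COST.matcher(description);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Sample rows taken from the question's input file.
        String[] rows = {
            "Chicago,One day car rental is <b>$90</b>",
            "Dallas,One day car rental is <b>$65</b>"
        };
        for (String row : rows) {
            // Split into city and description on the first comma only.
            String[] fields = row.split(",", 2);
            System.out.println(fields[0] + ", " + extractCost(fields[1]));
        }
    }
}
```

For the two sample rows this prints "Chicago, $90" and "Dallas, $65". A description without a <b>$...</b> span yields null, so it is worth deciding how such rows should be handled in the Pig script.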

Upvotes: 2
