math_law

Reputation: 391

Apache Pig: convert rows to a single column delimited with a character

I need to combine the Value column into a single row per City, with the values delimited by the "|" (pipe) character.

DATA = LOAD '/tmp/test.dat' USING PigStorage(',') AS (CITY:chararray, VALUE:chararray);

Input (City,Value):

ISTANBUL,1
ISTANBUL,2
ISTANBUL,3
NEWYORK,8
NEWYORK,9

Output:

ISTANBUL,1|2|3
NEWYORK,8|9

Upvotes: 0

Views: 172

Answers (1)

LiMuBei

Reputation: 3078

First group by CITY, then use BagToString (http://pig.apache.org/docs/r0.15.0/func.html#bagtostring) to join the values of each group into the required delimited string. Something like this (untested!):

data = LOAD '/tmp/test.dat' using PigStorage(',') AS (city:chararray, value:chararray);
data_grp = GROUP data BY city;
result = FOREACH data_grp GENERATE group AS city, BagToString(data.value, '|') AS values;
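
One thing to keep in mind: Pig bags are unordered, so the snippet above does not by itself guarantee that the values come out as 1|2|3 rather than, say, 2|1|3. A possible variant (also untested) sorts each group's bag inside a nested FOREACH before joining it. The alias `sorted` is a name introduced here for illustration, and the sketch assumes that sorting by `value` is the order you want. Since `value` is loaded as chararray, the sort is lexical, so "10" would sort before "2".

```pig
-- Same load and grouping as in the answer above.
data = LOAD '/tmp/test.dat' USING PigStorage(',') AS (city:chararray, value:chararray);
data_grp = GROUP data BY city;

-- Nested FOREACH: order each city's bag first, then join the values with '|'.
-- 'sorted' is an illustrative alias, not something from the original question.
result = FOREACH data_grp {
    sorted = ORDER data BY value;
    GENERATE group AS city, BagToString(sorted.value, '|') AS values;
};

DUMP result;
```

If the values should be ordered numerically instead, loading `value` as int (or casting it before the ORDER) would be the natural change.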

Upvotes: 2
