Reputation: 15
I have a txt file which has the following format:
{ (word1),(word2),(word3),....,(wordn) }
The words are NOT in quotes. I would like to use Apache Pig to change the format of this file to:
word1
word2
word3
wordn
Is there any way to do this with Apache Pig?
Upvotes: 1
Views: 259
Reputation: 4724
Can you try this?
input
{ (word1),(word2),(word3),(wordn) }
PigScript1:
A = LOAD 'input' AS (mybag:{T:(line:chararray)});
B = FOREACH A GENERATE BagToString(mybag.line, ' ');
STORE B INTO 'output';
Output (stored in the output/part* files):
word1 word2 word3 wordn
Update: If you want each word on its own line instead, use the FLATTEN operator.
PigScript2:
A = LOAD 'input' AS (mybag:{T:(line:chararray)});
B = FOREACH A GENERATE FLATTEN(mybag);
STORE B INTO 'output1';
Output:
word1
word2
word3
wordn
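If you want to check the result before writing anything to disk, a variant along these lines may help. It is only a sketch: the field name `word` is a name I made up for illustration, and it assumes the same 'input' file as above. DESCRIBE prints the schema of the relation and DUMP prints its records to the console.

```pig
-- Load each input line as a single bag of one-field tuples.
A = LOAD 'input' AS (mybag:{T:(line:chararray)});

-- FLATTEN un-nests the bag, producing one record per tuple in the bag.
-- 'word' is a hypothetical field name chosen for this example.
B = FOREACH A GENERATE FLATTEN(mybag) AS (word:chararray);

-- Inspect the schema and the records instead of storing them.
DESCRIBE B;
DUMP B;
```

For a small test file you can run the script in local mode with `pig -x local script.pig` (the script file name here is just a placeholder), then switch DUMP back to STORE once the output looks right.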
Upvotes: 0