Reputation: 10291
Has anyone else come across this problem, and if so, how did you solve it?
My Pig script "needs" to output as XML. The main body builds up XML as follows:
<Item><Val1>abc</Val1><Val2>qwe</Val2></Item>
<Item><Val1>tre</Val1><Val2>bnm</Val2></Item>
The problem is that this isn't well-formed XML, since there is no single root element. I need to wrap it like:
<Items>
<Item>...</Item>
</Items>
But how can this be done in Pig/Hadoop? The output is split across multiple part-XXXXX files, so the wrapping can only be done when they are merged.
Or maybe XML is completely the wrong approach, and it's always JSON!
Thanks
Duncan
Upvotes: 0
Views: 749
Reputation: 3805
Here's one possible solution. You could do a GROUP ALL immediately before your STORE to ensure only one part-XXXXX file is output. This would let you wrap your entire XML block with the desired <Items> tag.
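If you would rather keep several reducers, another option is to add the wrapper at merge time, outside Pig. Below is a minimal Python sketch of that idea, not something taken from the question: it assumes the part-XXXXX files have already been copied to a local directory (for example with hadoop fs -get) and that each line holds one complete <Item> element. The function name wrap_parts and its parameters are made up for illustration.

```python
import glob
import os


def wrap_parts(parts_dir, out_path, root="Items"):
    """Concatenate part-* files from parts_dir into out_path under one root element.

    Hypothetical helper: assumes every non-empty line in a part file is a
    complete <Item>...</Item> fragment, as in the question.
    Returns the number of part files that were merged.
    """
    # Sort so the merged output is in a stable, predictable order.
    part_files = sorted(glob.glob(os.path.join(parts_dir, "part-*")))

    with open(out_path, "w", encoding="utf-8") as out:
        out.write("<%s>\n" % root)          # opening wrapper tag
        for path in part_files:
            with open(path, encoding="utf-8") as part:
                for line in part:
                    line = line.rstrip("\n")
                    if line:                # skip blank lines
                        out.write(line + "\n")
        out.write("</%s>\n" % root)         # closing wrapper tag

    return len(part_files)
```

Calling wrap_parts("output_dir", "items.xml") would then produce a single file with <Items> as the root. The trade-off versus GROUP ALL is that the job keeps its parallelism, at the cost of one extra step after it finishes.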
Upvotes: 1