Reputation: 665
I have a large table, generated in Hue with the Pig Editor, that contains a few hundred thousand records. Pig writes several part files plus separate .pig_header and .pig_schema files. I need all the part files and the header combined into one complete table in .txt format. I can do this with the getmerge command:
-- To delete schema from output folder
fs -rm /OUTPUT_folder/.pig_schema
-- To merge all the part files and the header from the output folder and save the result in a .txt file
fs -getmerge /OUTPUT_folder/* /Another_folder/Result.txt
Is there any way in Cloudera to get this complete table without using the getmerge command?
Maybe Cloudera has a tool or command that combines the part files in one step.
After that, I just need to open the table with all the columns and headers displayed in a nicely ordered way. What is the best tool in Hue for this?
Upvotes: 0
Views: 813
Reputation: 7082
You could try a final GROUP ... ALL with an ORDER BY, followed by a FOREACH with FLATTEN(). That way all the records go to a single reducer, so the output ends up in only one part file.
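A minimal sketch of that idea, not tested against your data. The relation name, the input path, and the two columns (id, name) are hypothetical placeholders; only /OUTPUT_folder comes from the question. It assumes tab-separated data and a PigStorage version that supports the '-schema' option, which is what writes the .pig_header and .pig_schema files.

```pig
-- Hypothetical input path and schema; replace with your own.
data = LOAD '/INPUT_folder' USING PigStorage('\t') AS (id:int, name:chararray);

-- GROUP ... ALL puts every record into one group, which one reducer handles.
grouped = GROUP data ALL;

-- Sort inside that single group, then flatten back to one record per row.
result = FOREACH grouped {
    sorted = ORDER data BY id;
    GENERATE FLATTEN(sorted);
};

-- With a single reducer there should be a single part file in the output folder.
STORE result INTO '/OUTPUT_folder' USING PigStorage('\t', '-schema');
```

Keep in mind that this pushes all the records through one reducer, which can be slow for large tables. A plain ORDER ... BY with PARALLEL 1 may give the same single-file result with less work, though I have not compared the two.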
Upvotes: 0