Reputation: 34
I'm using Hadoop Archive to reduce the number of files in my Hadoop cluster, but for data retention I want to keep my data as long as possible. The problem is that Hadoop Archive does not reduce the folder size (my folders contain a mix of file types, both small and large, so SequenceFile is not a good fit).
I tried options like -D mapreduce.map.output.compress=true -D mapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.GzipCodec
but they had no effect on the archive size.
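For reference, the flags above only compress the intermediate map output, which is discarded after the job; they never touch the final files. To produce compressed files on HDFS, the corresponding MRv2 properties for *job output* compression would look like this (a mapred-site.xml sketch, not taken from the question):

```xml
<!-- mapred-site.xml: compress the final job output, not just map output -->
<property>
  <name>mapreduce.output.fileoutputformat.compress</name>
  <value>true</value>
</property>
<property>
  <name>mapreduce.output.fileoutputformat.compress.codec</name>
  <value>org.apache.hadoop.io.compress.GzipCodec</value>
</property>
```

Note that these properties apply to MapReduce jobs that write output; the `hadoop archive` tool itself stores member files verbatim, so setting them on the archive job alone will not shrink the HAR.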
Does anyone know a way to compress the output of Hadoop Archive, or can you suggest another way to achieve both goals (reduce the size and reduce the number of files)?
Any information is appreciated. Thanks so much.
Upvotes: 2
Views: 544
Reputation: 177
You can use MapReduce output compression to compress the data first, and then run `har` on the compressed directories.
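One way to sketch this two-step approach (the paths `/data/logs` and `/archives` are hypothetical, and the identity-mapper streaming job is just one possible way to rewrite a directory with compressed output):

```shell
# Step 1: rewrite the directory with gzip-compressed output, e.g. via a
# pass-through Hadoop Streaming job (identity mapper, no reducers).
# Caveat: this gzips the file contents, so readers must decompress later,
# and a plain-text pass-through is only safe for text data.
hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-*.jar \
  -D mapreduce.output.fileoutputformat.compress=true \
  -D mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.GzipCodec \
  -D mapreduce.job.reduces=0 \
  -mapper cat \
  -input /data/logs \
  -output /data/logs-gz

# Step 2: pack the now-compressed directory into a single HAR.
# HAR stores members verbatim, so the gzip savings carry over.
hadoop archive -archiveName logs.har -p /data logs-gz /archives
```

This gets both goals: the data is smaller because the individual files are gzipped before archiving, and the file count drops because HAR bundles them behind one index.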
Upvotes: 0