kee

Reputation: 11629

mapred.min.split.size

I am experimenting with this parameter in MapReduce and I have a question.

Does this go by the size of the file as stored in HDFS (whether it is compressed or not)? Or by the size after decompression? I guess it is the former, but I just want to confirm.

Upvotes: 1

Views: 1200

Answers (2)

Pham An Khang

Reputation: 21

I think bz2 files have been splittable since Hadoop 0.21, so you can use bz2 if you need compressed input that can still be split.

Upvotes: 2

Chris White

Reputation: 30089

This parameter is only used if your input format supports splitting the input files. Common compression codecs (such as gzip) don't support splitting, so for files compressed with them the parameter is ignored.

If the input format does support splitting, then the value relates to the compressed size, i.e. the size of the file as it is stored in HDFS.
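To make this concrete, here is a minimal sketch of how a split size is typically derived and applied to the on-disk length of a splittable file. It assumes the `max(minSize, min(maxSize, blockSize))` rule that Hadoop's `FileInputFormat` uses in the newer API; the class name `SplitSizeSketch`, the helper `countSplits`, and the sizes are hypothetical and chosen only for illustration. The real implementation also lets the last split run slightly over, which this sketch ignores.

```java
public class SplitSizeSketch {

    // Assumed rule: the minimum split size wins over the block size when it is larger.
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Hypothetical helper: number of splits for a splittable file, based on its
    // length as stored on disk (the compressed length if the file is compressed).
    static long countSplits(long fileLengthOnDisk, long splitSize) {
        return (fileLengthOnDisk + splitSize - 1) / splitSize; // ceiling division
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long blockSize = 64 * mb;          // illustrative HDFS block size
        long minSize = 128 * mb;           // illustrative mapred.min.split.size
        long maxSize = Long.MAX_VALUE;     // no upper limit configured
        long compressedLength = 512 * mb;  // size of the file in HDFS

        long splitSize = computeSplitSize(blockSize, minSize, maxSize);
        System.out.println(splitSize / mb);                            // prints 128
        System.out.println(countSplits(compressedLength, splitSize));  // prints 4
    }
}
```

In this sketch the uncompressed size never enters the calculation; only the length stored in HDFS does.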

Upvotes: 2
