TheCodeCache

Reputation: 972

Calculating input splits in MapReduce

A file of size 260 MB is stored in HDFS, where the default block size is 64 MB. When I ran a MapReduce job against this file, I found that it created only 4 input splits. How is that calculated, and where did the remaining 4 MB go? Any input is much appreciated.

Upvotes: 1

Views: 1351

Answers (1)

Ronak Patel

Reputation: 3849

An input split is NOT always a block in size. An input split is a logical representation of the data. Your input splits could have been 63 MB, 67 MB, 65 MB, and 65 MB (or other sizes, depending on the boundaries of the logical records)... see the examples in the links below...

Hadoop input split size vs block size

Another example - see section 3.3...
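For reference, the 4-split count from the question can be reproduced with the split-sizing logic in Hadoop's FileInputFormat: it carves off block-sized splits, but lets the final split absorb up to 10% extra (the SPLIT_SLOP factor) rather than creating a tiny trailing split. Below is a minimal, self-contained sketch of that loop (simplified; the real code also handles min/max split sizes, data locality, and unsplittable codecs):

```java
// SplitMath.java -- a simplified sketch modeled on the split-sizing
// loop in Hadoop's FileInputFormat; not the actual Hadoop source.
public class SplitMath {
    // FileInputFormat uses this constant: the last split may grow up
    // to 10% beyond splitSize instead of spawning a tiny extra split.
    private static final double SPLIT_SLOP = 1.1;

    public static void main(String[] args) {
        long mb        = 1024L * 1024;
        long fileSize  = 260 * mb; // the file from the question
        long splitSize = 64 * mb;  // block size, assuming defaults

        long bytesRemaining = fileSize;
        int numSplits = 0;
        // Carve off full-size splits while more than 1.1 splits'
        // worth of data remains.
        while ((double) bytesRemaining / splitSize > SPLIT_SLOP) {
            numSplits++;
            bytesRemaining -= splitSize;
        }
        if (bytesRemaining > 0) {
            numSplits++; // final split absorbs the leftover bytes
        }
        // 260 MB -> 64 + 64 + 64 + 68: prints "splits = 4"
        System.out.println("splits = " + numSplits);
    }
}
```

So the "missing" 4 MB are not lost: they are merged into a final 68 MB split, because 68/64 = 1.0625 falls within the 1.1 slop factor.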

Upvotes: 1
