mumbai

Reputation: 11

Process a small file with MapReduce in Hadoop

I have a 456 KB file which is read from HDFS and given as input to the mapper function. Every line contains an integer, for which I download some files and store them on the local system. I have Hadoop set up on a two-node cluster, and the split size is changed in the program so that 8 mappers are opened:

    Configuration configuration = new Configuration();

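    // force ~60 KB splits so the 456 KB input fans out to roughly 456000 / 60000 ≈ 8 map tasks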
    configuration.setLong("mapred.max.split.size", 60000L);
    configuration.setLong("mapred.min.split.size", 60000L);

8 mappers are created, but the same data is downloaded on both servers. I think this is happening because the block size is still set to the default of 256 MB and the input file is processed twice. So my question is: can we process a small file with MapReduce?

Upvotes: 0

Views: 231

Answers (1)

SSaikia_JtheRocker

Reputation: 5063

If your file downloads take time, you might be suffering from what's called speculative execution in Hadoop, which is enabled by default. It's just a guess though, since you said you are getting the same files downloaded more than once.

With speculative execution turned on, the same input can be processed multiple times in parallel to exploit differences in machine capabilities. As most of the tasks in a job come to a close, the Hadoop platform will schedule redundant copies of the remaining tasks across several nodes which do not have other work to perform.

You can disable speculative execution for the mappers and reducers by setting the mapred.map.tasks.speculative.execution and mapred.reduce.tasks.speculative.execution JobConf options to false, respectively.
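A minimal sketch of how this could look in your existing setup (only the property names come from the old mapred API; the surrounding job configuration is assumed to be yours from the question):

    Configuration configuration = new Configuration();

    // keep the forced ~60 KB splits from the question
    configuration.setLong("mapred.max.split.size", 60000L);
    configuration.setLong("mapred.min.split.size", 60000L);

    // disable speculative execution so map/reduce tasks are not
    // scheduled redundantly on the second node while downloads are slow
    configuration.setBoolean("mapred.map.tasks.speculative.execution", false);
    configuration.setBoolean("mapred.reduce.tasks.speculative.execution", false);

If the duplicate downloads disappear after this, speculative execution was indeed the cause.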

Upvotes: 1
