Sorting in the Hadoop framework

Question

I have tried implementing a secondary sort, and I have a question related to it:

Sorting happens three times in the Hadoop framework:

 1) Sorting in the buffer (sorting occurs based on the key emitted by the map function)
 2) Sorting while merging the spill files of the mapper output (based on what?)
 3) Sorting on the reducer side: when a reducer fetches map output from the various mappers according to the partition logic, merging happens again (sorting occurs based on the sort comparator)
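
To make the three stages concrete, here is a toy model in plain Java with no Hadoop dependency. It is only a sketch: the class and method names are made up for illustration, there is a single partition, and it assumes one comparator drives all three sorts, which is exactly the assumption my question is about.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class SortStagesSketch {

    // Hypothetical stand-in for the job's sort comparator (assumption:
    // the same comparator is used in every stage).
    static final Comparator<String> SORT_COMPARATOR = Comparator.naturalOrder();

    // Stage 1: sort one buffer's worth of map output before it is spilled.
    static List<String> sortSpill(List<String> buffer) {
        List<String> spill = new ArrayList<>(buffer);
        spill.sort(SORT_COMPARATOR);
        return spill;
    }

    // Stages 2 and 3: k-way merge of runs that are already sorted.
    static List<String> merge(List<List<String>> runs) {
        // Each heap entry is {run index, position inside that run}.
        PriorityQueue<int[]> heap = new PriorityQueue<>(
            (a, b) -> SORT_COMPARATOR.compare(
                runs.get(a[0]).get(a[1]), runs.get(b[0]).get(b[1])));
        for (int r = 0; r < runs.size(); r++) {
            if (!runs.get(r).isEmpty()) {
                heap.add(new int[] {r, 0});
            }
        }
        List<String> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] head = heap.poll();
            List<String> run = runs.get(head[0]);
            merged.add(run.get(head[1]));
            // Push the next element of the same run, if there is one.
            if (head[1] + 1 < run.size()) {
                heap.add(new int[] {head[0], head[1] + 1});
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Mapper A spills twice, mapper B spills once.
        List<String> a1 = sortSpill(List.of("d", "a"));
        List<String> a2 = sortSpill(List.of("c", "b"));
        List<String> b1 = sortSpill(List.of("f", "e"));

        List<String> mapperA = merge(List.of(a1, a2));           // stage 2
        List<String> reducerInput = merge(List.of(mapperA, b1)); // stage 3
        System.out.println(reducerInput);
    }
}
```

In this model the merge steps never re-sort anything; they only compare the heads of runs that are already sorted, so a merge gives a correctly ordered result only if it uses the same ordering that produced the runs.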

If my understanding above is correct, what logic governs the sorting when the spill files of the map output are merged? Is it based on the keys emitted by the map function, or on the sort comparator that drives the reduce-side sort, and why?
