nikosdi

Reputation: 2158

MapReduce: keep input ordering

I tried to implement an application using Hadoop that processes text files. The problem is that I cannot keep the ordering of the input text. Is there any way to choose the hash function? This problem could easily be solved by assigning a partition of the input to each mapper and then sending that partition to the reducers. Is this possible with Hadoop?

Upvotes: 5

Views: 3469

Answers (1)

Niels Basjes

Reputation: 10642

The basic idea of MapReduce is that the order in which things are done is irrelevant. So you cannot (and do not need to) control the order in which:

  • the input records go through the mappers.
  • the key and related values go through the reducers.

The only thing you can control is the order in which the values are placed in the iterator that is made available in the reducer. This is done using a construct called "secondary sort".
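To make the idea concrete, here is a minimal sketch of secondary sort in plain Python, with no Hadoop dependency. It only simulates the shuffle: the function names (`map_phase`, `shuffle_and_reduce`) and the tab-separated `key<TAB>value` input format are assumptions made for illustration, not Hadoop APIs. The mapper tags each record with its position in the input (similar in spirit to the byte offset that a line-oriented input format hands to the mapper as the key), the sort uses the full composite key, and the grouping uses only the natural key, so each reducer call sees its values in input order.

```python
from itertools import groupby


def map_phase(lines):
    """Emit ((natural_key, offset), value) pairs.

    The offset is the record's position in the input and acts as the
    secondary sort field. Assumes each line looks like 'key<TAB>value'.
    """
    offset = 0
    for line in lines:
        natural_key, _, value = line.partition("\t")
        yield (natural_key, offset), value
        offset += len(line) + 1  # +1 stands in for the newline character


def shuffle_and_reduce(pairs):
    """Simulate the shuffle: sort on the composite key, group on the natural key."""
    # "Sort comparator": order by (natural_key, offset).
    ordered = sorted(pairs, key=lambda kv: kv[0])
    result = {}
    # "Grouping comparator": only the natural key decides the group.
    for natural_key, group in groupby(ordered, key=lambda kv: kv[0][0]):
        # The values arrive in offset order, i.e. the original input order.
        result[natural_key] = [value for _, value in group]
    return result


lines = ["b\tfirst", "a\tone", "b\tsecond", "a\ttwo", "b\tthird"]
# Reverse the mapper output to mimic records arriving in arbitrary order.
pairs = list(map_phase(lines))[::-1]
print(shuffle_and_reduce(pairs))
```

In a real job the same three roles would be played by a composite key class, a sort comparator, and a grouping comparator, plus a partitioner that looks only at the natural key. With more than one input file the offset alone is not unique, so something like the file name would have to be part of the composite key as well.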

A quick web search for this term turns up several good starting points. I like this blog post: link

Upvotes: 2
