user3605176

Reputation: 39

Task that cannot be done through Map/Reduce

Is there any task that cannot be accomplished via Map/Reduce? By definition, a MapReduce job usually splits the input data set into independent chunks, which are processed by the map tasks in a completely parallel manner. This seems to imply that dependent tasks cannot be performed by MapReduce. Am I right? If so, please give an example.

Upvotes: 0

Views: 1906

Answers (2)

Joe K

Reputation: 18424

I think this is not quite the right question. You should ask not which tasks are impossible in MapReduce, but which tasks are inefficient in MapReduce.

At the most basic level, if you have an algorithm, you can just execute it in a single mapper. Set the job to only use 1 mapper and no reducers. Thus, it's kind of vacuously true that MapReduce can compute any task.

However, there are many tasks that do not translate well into MapReduce jobs. Graph algorithms and iterative algorithms (think k-means) are notoriously hard to do in MapReduce. They are possible, but they would take jumping through serious hoops, or maybe accepting an approximate answer.

To continue with the k-means example, you could, I suppose, have each iteration of the loop happen in one MapReduce job: mappers assign points to clusters, and reducers recalculate the means. This is "jumping through hoops" in that you have to pass the updated means from one MapReduce job to the next, and you could end up running many jobs, which is not efficient.
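To make the one-job-per-iteration idea concrete, here is a minimal sketch in plain Python that only simulates the map, shuffle, and reduce phases in memory. It is not Hadoop code, and the names `kmeans_map`, `kmeans_reduce`, `run_iteration`, and `kmeans_driver` are made up for illustration. In a real cluster each call to `run_iteration` would be a separate job, with the centroids written out and read back in between.

```python
from collections import defaultdict

def kmeans_map(point, centroids):
    """Mapper: emit (index of nearest centroid, point)."""
    nearest = min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )
    return nearest, point

def kmeans_reduce(points):
    """Reducer: recompute one centroid as the mean of its assigned points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def run_iteration(points, centroids):
    """One simulated MapReduce job: map, shuffle by key, reduce."""
    groups = defaultdict(list)
    for point in points:
        key, value = kmeans_map(point, centroids)
        groups[key].append(value)
    # A centroid with no assigned points keeps its previous position.
    return [
        kmeans_reduce(groups[i]) if i in groups else centroids[i]
        for i in range(len(centroids))
    ]

def kmeans_driver(points, centroids, max_jobs=20):
    """Driver: launch one job per iteration until the centroids stop moving."""
    for jobs_run in range(1, max_jobs + 1):
        updated = run_iteration(points, centroids)
        if updated == centroids:
            return updated, jobs_run
        centroids = updated
    return centroids, max_jobs
```

The driver loop is where the cost shows up: every pass through it stands for a full job launch plus a round trip of the centroids through storage.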

Alternatively, you could also have each mapper do k-means on a subset, then have a single reducer do k-means with the means from the mappers as input. You would get an approximate answer, but it would be efficient and translate naturally to MapReduce.
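A rough sketch of this single-job approximation, again simulated in plain Python rather than written against Hadoop. The function names and the fixed iteration count are assumptions for illustration. The reducer here treats every local mean equally; weighting each mean by the size of its cluster would be a natural refinement that this sketch leaves out.

```python
def local_kmeans(points, centroids, iterations=10):
    """Plain in-memory k-means (Lloyd's algorithm) for a fixed number of iterations."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids

def approximate_kmeans(splits, initial_centroids):
    """Single simulated job: mappers cluster their own split, one reducer merges."""
    # Map phase: each mapper sees only its own split of the data.
    local_means = [
        mean
        for split in splits
        for mean in local_kmeans(split, initial_centroids)
    ]
    # Reduce phase: a single reducer clusters the means from all mappers.
    return local_kmeans(local_means, initial_centroids)
```

No data moves between jobs here, which is why it maps so naturally onto MapReduce, but the result depends on how the points happen to be split across mappers.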

Upvotes: 3

Arnon Rotem-Gal-Oz

Reputation: 25909

You can do dependent tasks with MapReduce; you "just" can't do them efficiently. Graph-related algorithms are one example.

Do note that with Hadoop 2 and YARN you can run alternatives to MapReduce, such as Tez and Spark, that enable better handling of such algorithms.

Upvotes: 0
