Whilda Chaq

Reputation: 364

Compare MapReduce Performance

I have installed Hadoop MapReduce on a single node, and I am working on a top-ten problem.

Let's say I have 10k (key, value) pairs and want to find the 10 records with the best values.

First, I wrote a simple project that iterates over the whole data set, and it needs just a couple of minutes to get the answer.
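For reference, a minimal sketch of what such a single-pass search could look like. This is not the original project, which is not shown here; the names `top_n` and `data` and the random sample values are made up for illustration.

```python
import heapq
import random

def top_n(pairs, n=10):
    """Return the n (key, value) pairs with the largest values, best first."""
    # nlargest scans the data once and keeps only n candidates in memory.
    return heapq.nlargest(n, pairs, key=lambda kv: kv[1])

# Hypothetical sample data: 10k (key, value) pairs with random values.
random.seed(0)
data = [(f"key{i}", random.random()) for i in range(10_000)]

best = top_n(data)
for key, value in best:
    print(key, value)
```

Everything happens in one process and one pass over the data, with no job startup or HDFS reads and writes involved.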

Then I wrote a MapReduce application using the top-ten design pattern to solve the same problem, and it needs more than 4 hours to get the answer. (I use the same machine and the same sorting algorithm.)
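To make the comparison concrete, here is a rough plain-Python simulation of the top-ten pattern as it is usually described: each mapper keeps only the local top 10 of its split, and a single reducer merges those candidates. It does not use the Hadoop API, and `map_split`, `reduce_all`, `top_ten_pattern`, and the sample data are hypothetical names for illustration, not the code from my application.

```python
import heapq

N = 10  # how many records to keep

def map_split(split):
    # Mapper: keep only the local top N of this split and emit that.
    return heapq.nlargest(N, split, key=lambda kv: kv[1])

def reduce_all(mapper_outputs):
    # Single reducer: merge every mapper's candidates into the global top N.
    candidates = [kv for output in mapper_outputs for kv in output]
    return heapq.nlargest(N, candidates, key=lambda kv: kv[1])

def top_ten_pattern(pairs, num_splits=4):
    # Assumption: the input is cut into num_splits roughly equal splits.
    splits = [pairs[i::num_splits] for i in range(num_splits)]
    return reduce_all(map_split(s) for s in splits)

# Hypothetical sample data: 10k pairs with distinct integer values.
pairs = [(f"key{i}", (i * 7919) % 10007) for i in range(10_000)]
result = top_ten_pattern(pairs)
print(result)
```

The reducer sees at most N records per split, so the algorithmic work is about the same as in the single-pass version. Whatever extra time the real job takes has to come from somewhere outside these functions.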

I think this probably happens because MapReduce needs more services to run, more network activity, and more effort to read from and write to HDFS. Are there any other factors that explain why MapReduce (under these conditions) is slower than not using MapReduce?

Upvotes: 2

Views: 660

Answers (1)

Antariksha Yelkawar

Reputation: 403

MapReduce is slower on a single-node setup because only one mapper and one reducer can work on it at any given time. The mapper has to iterate through each of the splits, and the reducer works on two mapper outputs simultaneously, then on two such reducer outputs, and so on.

So, in terms of complexity:

for a normal project: t(n) = n => O(n)
for MapReduce: t(n) = (n/x)*t(n/2x) => O((n/x)log(n/x)), where x is the number of nodes

Which do you think is bigger, for a single node and for multiple nodes?

Explanation of the MapReduce complexity:

Time for one iteration: n

Number of simultaneous map functions: x, since only one can work on each node.

Time required to map the complete data: n/x, since n is the time one mapper takes for the complete data.

A reduce job requires half the time of the previous map, since it works on two mapper outputs simultaneously. Therefore time = n/2x for x reducers on x nodes.

Hence the equation, in which every step takes half the time of the previous one:

t(n) = (n/x)*t(n/2x)

Solving this recurrence, we get O((n/x)log(n/x)).

This is not supposed to be exact, only an approximation.
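To see which of the two estimates is bigger, here is a small sketch that plugs numbers into them. It assumes a base-2 logarithm and ignores all constant factors, job startup, and HDFS I/O, so treat the output as a rough comparison only. The function names are made up for illustration.

```python
import math

def sequential_cost(n):
    # Normal project: one pass over the data, O(n).
    return n

def mapreduce_cost(n, x):
    # Estimate from above: (n/x) * log(n/x), with x nodes.
    # Assumption: log base 2, constant factors dropped.
    m = n / x
    return m * math.log2(m)

n = 10_000  # the 10k pairs from the question
for x in (1, 4, 16, 64):
    print(f"x={x:3d}  sequential={sequential_cost(n)}  "
          f"mapreduce~{mapreduce_cost(n, x):.0f}")
```

Under these assumptions the single-node estimate (x = 1) comes out larger than n, and the MapReduce estimate only drops below n once x is large enough.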

Upvotes: 3
