Tripti

Reputation: 85

Pig local vs mapreduce mode performance comparison

I have set up a 3-node Hadoop cluster with Cloudera Manager (CDH4). When I ran a Pig job in mapreduce mode, it took twice as long as in local mode on the same data set. Is that expected behavior? Also, is there any documentation on performance tuning options for MapReduce jobs?

Thanks much for any help!

Upvotes: 1

Views: 630

Answers (3)

alexeipab

Reputation: 3619

Another reason is that when you run with -x local, Pig does not do the same jar compilation as it does in mapreduce mode. With small data sets and a complex Pig script, the jar compilation time becomes noticeable.

Upvotes: 0

Andrey Sozykin

Reputation: 926

A good start for performance tuning is the "Making Pig Fly" chapter from the "Programming Pig" book.

Upvotes: 0

Arnon Rotem-Gal-Oz

Reputation: 25909

This is probably because you are using a toy dataset, so the fixed overhead of MapReduce (job startup and task scheduling) is larger than the benefit of parallelization.
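To make the trade-off concrete, here is a rough back-of-the-envelope model in Python. It is only a sketch: the function names, the even split of work across workers, and every number you plug in are assumptions for illustration, not measurements from Pig or Hadoop.

```python
# Toy cost model for local vs. cluster execution.
# All names and parameters here are hypothetical; plug in your own estimates.


def local_seconds(records, per_record_s):
    """Single-process run: no job startup cost, no parallelism."""
    return records * per_record_s


def mapreduce_seconds(records, per_record_s, workers, overhead_s):
    """Cluster run: a fixed overhead plus the work split evenly across workers."""
    return overhead_s + records * per_record_s / workers


def break_even_records(per_record_s, workers, overhead_s):
    """Input size at which both modes take the same time (needs workers > 1).

    Solves records * c = overhead + records * c / w for records.
    """
    return overhead_s / (per_record_s * (1 - 1 / workers))
```

For example, with an assumed 30 seconds of fixed overhead, 3 workers, and 10 microseconds of work per record, `break_even_records(1e-5, 3, 30.0)` comes out at about 4.5 million records. Below that size, the local run wins in this model; well above it, the cluster run does.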

Upvotes: 1
