user3123372

Reputation: 744

MapReduce real life uses

I would like to know in which cases MapReduce is chosen over Hive or Pig.

I know that it is used when

  1. We need in-depth filtering of the input data.
  2. We are working with unstructured data.
  3. We are working with graphs.

But is there any case where we cannot use Hive or Pig, or where plain MapReduce works much better and is heavily used in real projects?

Upvotes: 0

Views: 220

Answers (2)

ramblingpolak

Reputation: 622

Bare MapReduce is not written very often these days. Higher level abstractions such as the two you mentioned are more popular and adequate for query workloads.

Even in scenarios where HiveQL is too restrictive, one might look at alternatives such as Cascading or Scalding for low-level batch jobs, or the increasingly popular Spark.

A primary motivation for using these high-level abstractions is that most applications require a sequence of map and reduce phases, and the MapReduce APIs leave you on your own to figure out how to chain those phases and serialize data between tasks.
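To make the chaining point concrete, here is a minimal in-memory sketch in plain Python. It is not Hadoop code: `run_stage`, `words_by_count`, and the sample input are hypothetical names made up for illustration. It only shows that with a bare map/reduce model you wire the output of one stage into the next by hand, which is the plumbing that Hive, Pig, Cascading, or Spark generate for you.

```python
from collections import defaultdict


def run_stage(records, mapper, reducer):
    """Simulate one map/shuffle/reduce stage in memory (illustrative only)."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # "shuffle": group values by key
    output = []
    for key in sorted(groups):  # reducers see keys in sorted order
        output.extend(reducer(key, groups[key]))
    return output


# Stage 1: classic word count.
def tokenize(line):
    for word in line.lower().split():
        yield word, 1


def sum_counts(word, counts):
    yield word, sum(counts)


# Stage 2: regroup the stage 1 output by count.
def invert(pair):
    word, count = pair
    yield count, word


def collect_words(count, words):
    yield count, sorted(words)


def words_by_count(lines):
    # The hand-off between stages is manual. In a real cluster this is
    # where you would choose an intermediate file format and location.
    stage1 = run_stage(lines, tokenize, sum_counts)
    return run_stage(stage1, invert, collect_words)


print(words_by_count(["a b a", "b c a"]))
```

In a real job each stage would be a separate MapReduce job with its own driver, and the intermediate list would be files on HDFS in a format you pick yourself.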

Upvotes: 0

Durga Viswanath Gadiraju

Reputation: 3966

Hive and Pig are generic solutions, so they add some overhead when processing data. In most scenarios this overhead is negligible, but in some cases it can be considerable.

If many tables need to be joined, Hive and Pig will try to apply a generic solution. If you write MapReduce yourself after understanding the data, you can come up with a more optimal solution.
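As one example of what "understanding the data" can buy you, here is a rough sketch of a map-side join in plain Python. It assumes, purely for illustration, that one table (a hypothetical `customers` lookup) is small enough to hold in memory on every mapper, so the large table (hypothetical `orders`) can be joined during the map step with no shuffle or reduce phase. The table names and record layouts are invented for this example.

```python
def map_side_join(orders, customers_by_id):
    """Inner-join each order to its customer inside the map step.

    orders: iterable of (order_id, customer_id, amount) tuples (large side).
    customers_by_id: dict of customer_id -> name (small side, assumed to
    fit in memory on every mapper).
    """
    for order_id, customer_id, amount in orders:
        name = customers_by_id.get(customer_id)
        if name is not None:  # inner join: skip orders without a customer
            yield order_id, name, amount


# Hypothetical sample data.
customers = {"c1": "Ann", "c2": "Bob"}
orders = [(1, "c1", 10.0), (2, "c2", 5.0), (3, "c9", 7.5)]

for row in map_side_join(orders, customers):
    print(row)
```

Hive and Pig can also do this kind of join when told to or when their statistics allow it, so the gain from hand-written MapReduce depends on how much more you know about the data than the engine does.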

However, MapReduce should be treated as a kernel. If your solution can be reused elsewhere, it is better to develop it in MapReduce and integrate it with Hive/Pig/Sqoop.

Pig can be used to process unstructured data. It gives more flexibility than Hive when processing the data.

Upvotes: 1
