Anupam Bansal

Reputation: 17

Big Data File Processing in Map Reduce

I am trying to understand how MapReduce works in general. What I know so far is that mappers run in parallel across several machines and produce a result set, which reducers, also running in parallel across several machines, then use to produce the intended data set.
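The flow described above (parallel map, a grouping step, parallel reduce) can be sketched with a toy word-count example. This is a minimal simulation in plain Python, not Hadoop's actual API; the input splits would normally live on different nodes.

```python
from collections import defaultdict

# Hypothetical input: each "split" would normally sit on a different node.
splits = ["big data map reduce", "map reduce big data", "data data"]

def mapper(text):
    # Emit a (word, 1) pair for every word in the split.
    for word in text.split():
        yield (word, 1)

def shuffle(mapped):
    # Group all values by key, as the framework does between the phases.
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    return groups

def reducer(key, values):
    # Combine all values for one key into the final result.
    return (key, sum(values))

mapped = [pair for split in splits for pair in mapper(split)]
results = dict(reducer(k, v) for k, v in shuffle(mapped).items())
print(results)  # {'big': 2, 'data': 4, 'map': 2, 'reduce': 2}
```

In a real cluster, `mapper` calls run concurrently on the nodes holding each split, and the shuffle moves each key's values to the reducer responsible for it.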

My questions are:

Upvotes: 0

Views: 284

Answers (1)

SSaikia_JtheRocker

Reputation: 5063

Answers:

  1. Yes. Basically, a job starts, processes files, and ends. It does not run forever.

  2. Stream processing can be handled by Storm or similar technologies, but not by Hadoop alone, since Hadoop is a batch-processing system. You can also look at how Hadoop YARN and Storm can work together.

  3. There should be a point of reference, because the TaskTrackers running on different nodes periodically send status information about the tasks (map tasks/reduce tasks) they are running to the JobTracker, which coordinates the job run.
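The coordination in point 3 can be sketched as a heartbeat protocol: workers report task state to a central coordinator, which decides when the job is done. All class and method names below are illustrative, not Hadoop's real API.

```python
# Toy sketch of TaskTracker -> JobTracker status reporting.
# Names (JobTracker, heartbeat, job_complete) are hypothetical.

class JobTracker:
    def __init__(self):
        self.task_status = {}  # task_id -> (node, latest reported state)

    def heartbeat(self, node, statuses):
        # Each heartbeat carries the current state of tasks on that node.
        for task_id, state in statuses.items():
            self.task_status[task_id] = (node, state)

    def job_complete(self):
        # The job is done once every known task has reported DONE.
        return all(state == "DONE" for _, state in self.task_status.values())

jt = JobTracker()
jt.heartbeat("node1", {"map_0": "RUNNING", "map_1": "DONE"})
jt.heartbeat("node2", {"reduce_0": "RUNNING"})
print(jt.job_complete())  # False: map_0 and reduce_0 still running
jt.heartbeat("node1", {"map_0": "DONE"})
jt.heartbeat("node2", {"reduce_0": "DONE"})
print(jt.job_complete())  # True
```

The real protocol also uses missed heartbeats to detect dead nodes and reschedule their tasks, which this sketch omits.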

Upvotes: 1
