Ram

Reputation: 337

Resource allocation for MapReduce and Spark jobs in cluster

I'm unable to understand the internal mechanism of resource allocation for MapReduce and Spark jobs.

We can run MapReduce and Spark jobs in the same cluster. For a MapReduce job, the resource manager allocates the available resources, such as data nodes and task trackers, to the job; internally the job may require 'N' mappers and reducers.
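
For concreteness, here is a minimal MapReduce sketch of what I mean (the class name, paths, and memory values are just placeholders): the number of mappers comes from the input splits, while the reducers and per-task memory are requested explicitly and granted as containers by the resource manager. Mapper/reducer classes are left at the framework defaults since the focus is only the resource request.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class MrResourceDemo {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            conf.set("mapreduce.map.memory.mb", "1024");     // memory per map container
            conf.set("mapreduce.reduce.memory.mb", "2048");  // memory per reduce container

            Job job = Job.getInstance(conf, "mr-resource-demo");
            job.setJarByClass(MrResourceDemo.class);
            job.setNumReduceTasks(4);                        // 'N' reducers requested up front
            // number of mappers is not set here: it is derived from the input splits
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }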

A Spark context, on the other hand, needs worker nodes and executors (internally JVMs) to run the program.
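
Similarly, a minimal Spark sketch (class name and values are placeholders; the master, e.g. yarn, would normally be supplied via spark-submit) showing how executors - the JVMs on worker nodes - are requested:

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    public class SparkResourceDemo {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf()
                    .setAppName("spark-resource-demo")
                    // the master (e.g. "yarn") is normally passed via spark-submit --master
                    .set("spark.executor.instances", "4")   // how many executor JVMs to request
                    .set("spark.executor.memory", "2g")     // memory per executor
                    .set("spark.executor.cores", "2");      // cores per executor

            JavaSparkContext sc = new JavaSparkContext(conf);
            long count = sc.parallelize(java.util.Arrays.asList(1, 2, 3, 4, 5)).count();
            System.out.println("count = " + count);
            sc.stop();
        }
    }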

Does that mean there will be different nodes for MapReduce and Spark jobs? If not, how is the distinction made between task trackers and executors? How does the cluster manager identify the specific node for a Hadoop or Spark job?

Can someone enlighten me here?

Upvotes: 1

Views: 798

Answers (2)

Raktotpal Bordoloi

Reputation: 1057

Task trackers and executors are all daemons.

When an MR job is submitted, the job-tracker service (or the resource-manager service in YARN) allocates a suitable node manager with the required resources.

And when a Spark job is submitted, the application master acquires worker nodes where resources are available near the data, and submits/deploys tasks on those nodes through the executor service.

It is just the different services/daemons of the underlying framework - whether MR or Spark - that manage the whole job scheduling and start JVMs with the appropriate resources on the appropriate nodes.
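
To make that concrete, here is a rough sketch (class name and paths are assumptions, and both submissions presume an existing YARN cluster, normally driven via spark-submit and the Hadoop configuration on the classpath): the same ResourceManager serves both frameworks; only the ApplicationMaster and the task JVMs it launches differ.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    public class SharedClusterDemo {
        public static void main(String[] args) throws Exception {
            // 1) MapReduce on YARN: the ResourceManager starts an MR ApplicationMaster,
            //    which then asks for containers for the map and reduce task JVMs on
            //    whichever NodeManagers have capacity (preferring nodes holding the input blocks).
            Configuration hadoopConf = new Configuration();
            hadoopConf.set("mapreduce.framework.name", "yarn");
            Job mrJob = Job.getInstance(hadoopConf, "identity-mr-on-shared-cluster");
            mrJob.setJarByClass(SharedClusterDemo.class);
            FileInputFormat.addInputPath(mrJob, new Path(args[0]));
            FileOutputFormat.setOutputPath(mrJob, new Path(args[1]));
            mrJob.waitForCompletion(true);

            // 2) Spark on YARN: the very same ResourceManager starts a Spark
            //    ApplicationMaster, which asks it for executor JVMs - again just
            //    containers on whichever NodeManagers have free resources.
            SparkConf sparkConf = new SparkConf()
                    .setAppName("spark-on-shared-cluster")
                    .setMaster("yarn");
            JavaSparkContext sc = new JavaSparkContext(sparkConf);
            long lines = sc.textFile(args[0]).count();
            System.out.println("lines = " + lines);
            sc.stop();
        }
    }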

Upvotes: 1

Wang

Reputation: 155

In my opinion, when a Spark program runs, it is split into several Spark jobs, and each job is split into several tasks. Tasks come in various types, including map and reduce; map-reduce is just one concrete computation pattern.
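
A small local-mode sketch of that idea (names and data are made up): each action triggers a job, and a shuffle such as reduceByKey marks a map/reduce-style stage boundary inside it.

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import scala.Tuple2;
    import java.util.Arrays;

    public class JobsAndTasksDemo {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("jobs-and-tasks").setMaster("local[2]");
            JavaSparkContext sc = new JavaSparkContext(conf);

            JavaPairRDD<String, Integer> counts = sc
                    .parallelize(Arrays.asList("a", "b", "a", "c", "b", "a"), 2)
                    .mapToPair(w -> new Tuple2<>(w, 1))   // narrow transformation: no new stage
                    .reduceByKey(Integer::sum);           // shuffle: stage boundary (map/reduce pattern)

            counts.collect().forEach(System.out::println); // one action -> one job, split into tasks
            System.out.println(counts.count());            // another action -> another job
            sc.stop();
        }
    }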

Upvotes: 0
