HHH

Reputation: 6465

How to submit multiple jobs to a hadoop cluster

I have a Hadoop cluster running Hadoop 2.6, and I'd like to submit multiple jobs to it in parallel. I'd like to know whether I should simply submit multiple jobs and let the cluster handle the rest, or whether I should write them as a YARN application. As a matter of fact, I'm not very familiar with YARN application development and don't know exactly how it differs from a regular Hadoop application.

Upvotes: 0

Views: 770

Answers (2)

alekya reddy

Reputation: 934

You can run MapReduce jobs using either MR1 or YARN. YARN has nothing to do with job parallelism; it is just a framework for running various kinds of jobs.

Use Oozie workflows or shell scripts to run the jobs in parallel.
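For the shell-script route, a minimal sketch: submit each job as a background process and let YARN schedule them concurrently. The `run_parallel` helper and the jar names, classes, and paths in the commented example are placeholders, not anything specific to your cluster:

```shell
#!/bin/bash
# Run each command passed as an argument in the background, then wait
# for all of them. With `hadoop jar ...` commands, each background
# process submits one job; YARN schedules the jobs concurrently.
run_parallel() {
  local pids=()
  for cmd in "$@"; do
    bash -c "$cmd" &      # submit in the background
    pids+=("$!")
  done
  local status=0
  for pid in "${pids[@]}"; do
    wait "$pid" || status=1   # remember if any job failed
  done
  return "$status"
}

# Hypothetical usage -- jar, class, and HDFS paths are placeholders:
# run_parallel \
#   "hadoop jar wordcount.jar WordCount /in1 /out1" \
#   "hadoop jar wordcount.jar WordCount /in2 /out2"
```

The exit status is non-zero if any of the submitted commands fails, so the script can be used as a step in a larger pipeline.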

Upvotes: 1

InfamousCoconut

Reputation: 794

You can define an Oozie workflow that forks the MapReduce jobs. The following is the example for this from the Apache Oozie documentation.

<workflow-app name="sample-wf" xmlns="uri:oozie:workflow:0.1">
    ...
    <fork name="forking">
        <path start="firstparalleljob"/>
        <path start="secondparalleljob"/>
    </fork>
    <action name="firstparalleljob">
        <map-reduce>
            <job-tracker>foo:9001</job-tracker>
            <name-node>bar:9000</name-node>
            <job-xml>job1.xml</job-xml>
        </map-reduce>
        <ok to="joining"/>
        <error to="kill"/>
    </action>
    <action name="secondparalleljob">
        <map-reduce>
            <job-tracker>foo:9001</job-tracker>
            <name-node>bar:9000</name-node>
            <job-xml>job2.xml</job-xml>
        </map-reduce>
        <ok to="joining"/>
        <error to="kill"/>
    </action>
    <join name="joining" to="nextaction"/>
    ...
</workflow-app>

Upvotes: 0
