Why does Oozie allocate more memory when running MapReduce jobs?

Question

I'm running MapReduce jobs using Oozie. The workflow only invokes my MapReduce driver class and does nothing else, yet it takes a lot of memory: it needs a container of at least 2 GB just to invoke the driver class. Below is the workflow.xml:





    
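        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <property>
                <name>mapred.job.queue.name</name>
                <value>${jobQueue}</value>
            </property>
        </configuration>
        <exec>${jobScript}</exec>
        <argument>${arguments}</argument>
        <argument>${queueName}</argument>
        <argument>${wf:id()}</argument>
        <file>myPath/MyDriver.sh#MyDriver.sh</file>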
    
    
    


    <message>Job failed
        failed:[${wf:errorMessage(wf:lastErrorNode())}]</message>

My shell script (MyDriver.sh) looks like this:

hadoop jar myJar.jar MyDriverClass "$1" "$2" "$3"

Why does Oozie take so much memory, and how can I reduce its memory consumption?

Ruslan Ostafiichuk · Accepted Answer

A shell action will start at least two mappers to run your Java class.

You can avoid this by using a java action. Put your jar inside the ${workflow-path}/lib/ directory and change your workflow:


    
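    <java>
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <property>
                <name>mapred.job.queue.name</name>
                <value>${jobQueue}</value>
            </property>
        </configuration>
        <main-class>MyDriverClass</main-class>

        <arg>${arguments}</arg>
        <arg>${queueName}</arg>
        <arg>${wf:id()}</arg>
    </java>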
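
For the jar in lib/ to be picked up, the job has to point at the directory that holds workflow.xml. Here is a rough job.properties sketch, assuming ${workflow-path} in the answer means the workflow application directory (oozie.wf.application.path). The host names, ports, queue names, input path, and the my-workflow directory are all hypothetical placeholders, not values from the question.

```properties
# All values below are hypothetical; substitute your cluster's hosts and paths.
nameNode=hdfs://namenode.example.com:8020
jobTracker=resourcemanager.example.com:8032

# Parameters referenced by the workflow (${jobQueue}, ${arguments}, ${queueName}).
jobQueue=default
queueName=default
arguments=/data/input

# HDFS directory that contains workflow.xml. Jars placed under its lib/
# subdirectory are added to the action's classpath by Oozie.
oozie.wf.application.path=${nameNode}/user/${user.name}/apps/my-workflow
```

With this layout, myJar.jar would sit at my-workflow/lib/myJar.jar next to my-workflow/workflow.xml, so the java action can load MyDriverClass without a wrapper script.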
