user1571307

Reputation: 335

MapReduce jobs get stuck in Accepted state

I have my own MapReduce code that I'm trying to run, but it just stays in the Accepted state. I tried running another sample MR job that I'd run previously and which was successful, but now both jobs stay in the Accepted state. I tried changing various properties in mapred-site.xml and yarn-site.xml as mentioned here and here, but that didn't help either. Can someone please point out what could possibly be going wrong? I'm using Hadoop 2.2.0.

I've tried many values for the various properties; here is one set of values. In mapred-site.xml:

<property>
  <name>mapreduce.job.tracker</name>
  <value>localhost:54311</value>
</property>

<property>
  <name>mapreduce.job.tracker.reserved.physicalmemory.mb</name>
  <value></value>
</property>

<property>
  <name>mapreduce.map.memory.mb</name>
  <value>256</value>
</property>

<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>256</value>
</property>

<property>
  <name>yarn.app.mapreduce.am.resource.mb</name>
  <value>400</value>
  <source>mapred-site.xml</source>
</property>

In yarn-site.xml:

<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>400</value>
  <source>yarn-site.xml</source>
</property>

<property>
  <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
  <value>.3</value>
</property>

Upvotes: 21

Views: 44541

Answers (6)

Srinivas

Reputation: 311

Setting the property yarn.resourcemanager.hostname to the master node's hostname in yarn-site.xml, and copying this file to all the nodes in the cluster so they pick up the configuration, solved the issue for me.
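For reference, a minimal sketch of what that property looks like in yarn-site.xml (the hostname master.example.com is a placeholder for your actual master node):

<property>
  <name>yarn.resourcemanager.hostname</name>
  <!-- placeholder hostname: substitute the master node's actual name -->
  <value>master.example.com</value>
</property>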

Upvotes: 0

Binita Bharati

Reputation: 5928

I am using Hadoop 3.0.1. I faced the same issue, where submitted MapReduce jobs were shown as stuck in the ACCEPTED state in the ResourceManager web UI. In that same UI, under Cluster Metrics, Memory Used was 0 and Memory Total was 0; under Cluster Node Metrics, Active Nodes was 0, although the NameNode web UI listed the data nodes perfectly. Running yarn node -list on the cluster did not display any NodeManagers. It turned out my NodeManagers were not running. After starting the NodeManagers, newly submitted MapReduce jobs could proceed; they were no longer stuck in the ACCEPTED state and got to the RUNNING state.
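As a quick sanity check, something like the following (assuming a Hadoop 3.x installation with the yarn script on the PATH) lists the registered NodeManagers and starts one on the current host:

# List the NodeManagers registered with the ResourceManager
yarn node -list

# Start a NodeManager daemon on this host (Hadoop 3.x daemon syntax)
yarn --daemon start nodemanager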

Upvotes: 1

Manish Bansal

Reputation: 2680

I faced the same issue. I changed every configuration mentioned in the answers above, but it was still no use. After this, I re-checked the health of my cluster and observed that my one and only node was in an unhealthy state. The cause was a lack of disk space in my /tmp/hadoop-hadoopUser/nm-local-dir directory. The same can be checked via the node health status in the ResourceManager web UI (port 8088 by default). To resolve this, I added the property below to yarn-site.xml.

<property>
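    <!-- The default is 90.0; raising it keeps a nearly-full disk from marking the node unhealthy -->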
    <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
    <value>98.5</value>
</property>

After restarting my Hadoop daemons, the node status changed to healthy and jobs started to run.

Upvotes: 0

secfree

Reputation: 4677

A job stuck in the ACCEPTED state on YARN is usually because free resources are insufficient. You can check this at http://resourcemanager:port/cluster/scheduler:

  1. if Memory Used + Memory Reserved >= Memory Total, memory is not enough
  2. if VCores Used + VCores Reserved >= VCores Total, VCores are not enough

It may also be limited by parameters such as maxAMShare.
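For example, with the Fair Scheduler, maxAMShare caps the fraction of a queue's resources that Application Masters may consume. A sketch for fair-scheduler.xml (the queue name "default" is illustrative):

<allocations>
  <queue name="default">
    <!-- Fair Scheduler default is 0.5; -1.0 disables the check entirely -->
    <maxAMShare>0.5</maxAMShare>
  </queue>
</allocations>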

Upvotes: 10

Romain Jouin

Reputation: 4848

I had the same issue; for me the cause was a nearly full hard drive (>90% used). Clearing some space saved me.

Upvotes: 10

Niels Basjes

Reputation: 10652

I've had the same effect, and found that giving each worker node more available memory and reducing the memory required per application helped.

The settings I have (on my very small experimental boxes) in my yarn-site.xml:

<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>2200</value>
  <description>Amount of physical memory, in MB, that can be allocated for containers.</description>
</property>

<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>500</value>
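  <description>The minimum allocation, in MB, for every container request at the ResourceManager.</description>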
</property>

Upvotes: 14
