ableHercules

Reputation: 660

YARN is using 100% of resources when running a Hive job

I'm running a Hive Tez job. The job loads data from a table stored as text files into another table stored in ORC format.

I'm using:

INSERT INTO TABLE ORDERREQUEST_ORC 
PARTITION(DATE)
SELECT 
COLUMN1, 
COLUMN2, 
COLUMN3,
DATE
FROM ORDERREQUEST_TXT; 
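
For reference, since the statement fills the DATE partition dynamically from the last column of the SELECT, the session is run with dynamic-partition settings along these lines (illustrative values; your cluster defaults may already cover this):

SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
-- cap on how many partitions a single insert may create; tune to your data
SET hive.exec.max.dynamic.partitions=1000;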

While monitoring the job through the Ambari web console, I saw that YARN memory utilization is at 100%.

Can you please advise how to maintain healthy YARN memory usage?

The load average on all three datanodes:

 1. top - 17:37:24 up 50 days, 3:47, 4 users, load average: 15.73, 16.43, 13.52 
 2. top - 17:38:25 up 50 days, 3:48, 2 users, load average: 16.14, 15.19, 12.50 
 3. top - 17:39:26 up 50 days, 3:49, 1 user, load average: 11.89, 12.54, 10.49 

These are the YARN configurations:

 yarn.scheduler.minimum-allocation-mb=5120 
 yarn.scheduler.maximum-allocation-mb=46080 
 yarn.nodemanager.resource.memory-mb=46080
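
Rough arithmetic on what these settings allow (assuming the default Capacity Scheduler behaviour, where one job may fill the whole queue):

 yarn.nodemanager.resource.memory-mb / yarn.scheduler.minimum-allocation-mb = 46080 / 5120 = 9 containers per node (at the minimum container size)
 3 datanodes * 46080 MB = 138240 MB (~135 GB) of total YARN memory, i.e. up to 27 containers

So a single Tez job is allowed to claim all of that memory, which is what the 100% figure in Ambari reflects.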

FYI: my cluster config

 Nodes = 4 (1 master, 3 datanodes) 
 Memory = 64 GB on each node 
 Processors = 6 on each node 
 1 TB on each node (5 disks * 200 GB)

How can I reduce YARN memory utilization?

Upvotes: 1

Views: 5872

Answers (1)

sree

Reputation: 1960

You are seeing this behaviour because the cluster hasn't been configured to limit the maximum YARN memory a single user can consume.

Please set the property below in the YARN configuration to allocate at most 33% of the queue's YARN memory per user; the value can be adjusted based on your requirements.

Change from:

yarn.scheduler.capacity.root.default.user-limit-factor=1

To:

yarn.scheduler.capacity.root.default.user-limit-factor=0.33
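
Outside of the Ambari UI, this maps onto an entry in capacity-scheduler.xml followed by a queue refresh; a minimal sketch, assuming the job runs in the default queue:

 <property>
   <name>yarn.scheduler.capacity.root.default.user-limit-factor</name>
   <value>0.33</value>
 </property>

Then apply it without restarting the ResourceManager:

 yarn rmadmin -refreshQueues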

If you need further info on this, please refer to the following link: https://analyticsanvil.wordpress.com/2015/08/16/managing-yarn-memory-with-multiple-hive-users/

Upvotes: 5
