Reputation: 660
I'm running a Hive-on-Tez job. The job loads data from a table stored as text files into another table stored as ORC.
I'm using:
INSERT INTO TABLE ORDERREQUEST_ORC
PARTITION(DATE)
SELECT
COLUMN1,
COLUMN2,
COLUMN3,
DATE
FROM ORDERREQUEST_TXT;
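(Since the DATE partition value comes from the SELECT, this is a dynamic-partition insert, which needs the usual session settings enabled beforehand, e.g.:
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
)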
While monitoring the job through the Ambari web console, I saw that YARN memory utilization was at 100%.
Can you please advise how to keep YARN memory at a healthy level?
These are the load averages on the three datanodes:
1. top - 17:37:24 up 50 days, 3:47, 4 users, load average: 15.73, 16.43, 13.52
2. top - 17:38:25 up 50 days, 3:48, 2 users, load average: 16.14, 15.19, 12.50
3. top - 17:39:26 up 50 days, 3:49, 1 user, load average: 11.89, 12.54, 10.49
These are the YARN configurations:
yarn.scheduler.minimum-allocation-mb=5120
yarn.scheduler.maximum-allocation-mb=46080
yarn.nodemanager.resource.memory-mb=46080
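(If I read these settings right, each node offers 46080 MB to YARN and the minimum container is 5120 MB, so a node can run at most 46080 / 5120 = 9 containers, i.e. up to 27 containers across the 3 datanodes; a single busy Tez job can therefore claim the entire cluster.)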
FYI, my cluster config:
Nodes = 4 (1 master, 3 datanodes)
Memory = 64 GB on each node
Processors = 6 on each node
Storage = 1 TB on each node (5 disks * 200 GB)
How can I reduce YARN memory utilization?
Upvotes: 1
Views: 5872
Reputation: 1960
You are seeing this because the cluster hasn't been configured to cap the amount of YARN memory a single user can consume.
Set the property below in the capacity scheduler configuration to limit each user to 33% of the queue's capacity; the factor can be adjusted to suit your requirements.
Change from:
yarn.scheduler.capacity.root.default.user-limit-factor=1
To:
yarn.scheduler.capacity.root.default.user-limit-factor=0.33
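If you manage the config files by hand rather than through Ambari, the same setting lives in capacity-scheduler.xml; a minimal sketch, assuming you are using the default queue:
<property>
  <name>yarn.scheduler.capacity.root.default.user-limit-factor</name>
  <value>0.33</value>
</property>
After editing the file, the change can be picked up without a ResourceManager restart by running yarn rmadmin -refreshQueues.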
If you need further info on this, please refer to the following link: https://analyticsanvil.wordpress.com/2015/08/16/managing-yarn-memory-with-multiple-hive-users/
Upvotes: 5