Reputation: 11
I have set up Cloudera Hue
and have a cluster of master node of 200 Gib and 16 Gib RAM and 3 datnodes of each 150 Gib and 8 Gib Ram.
I have database of size 70 Gib approx. The problem is when I try to run Hive queries from hive editor(HUE GUI
). If I submit 5 to 6 queries(for execution) Jobs are started but they hang and never run. How can I run the queries sequentially. I mean even though I can submit queries but the new query should only start when previous is completed. Is there any way so that I can make the queries run one by one?
Upvotes: 0
Views: 10287
Reputation: 3845
You can run all your queries in one go and by separating them using ';' in HUE.
For example:
Query1; Query2; Query3
In this case query1, query2 and query3 will run sequentially one after another
Upvotes: 1
Reputation: 11
so the entire flow of YARN/MR2 is as follow
so now resource allocation is handled by YARN.in case of cloudera cluster, Dynamic resource pools(kind of a queue) is the place where jobs are submitted and and then resource allocation is done by yarn for these jobs. by default the value of maximum concurrent jobs is set in such a way that resource manager allocates all the resource to all the jobs/Application masters leaving no space for task containers(which is required at later stage for running tasks by application masters.)
so if we submit large no of queries in HUE Hive editor for execution they will be submitted as jobs concurrently and application masters for them will be allocated resources leaving no space for task containers and thus all jobs will be in pending state.
Solution is as mentioned above by @Romain
set the value of max no of concurrent jobs accordingly to the size and capability of cluster. in my case it worked for the value of 4 now only 4 jobs will be run concurrently from the pool and they will be allocated resources by the resource manager.
Upvotes: 0
Reputation: 7082
Hue submits all the queries, if they hang, it means that you are probably hitting a misconfiguration in YARN, like gotcha #5 http://blog.cloudera.com/blog/2014/04/apache-hadoop-yarn-avoiding-6-time-consuming-gotchas/
Upvotes: 0