Noah

Reputation: 1

Hive LLAP Daemon Issues

I have a small Hadoop cluster with a total of 10 nodes, each equipped with 96 cores, 1 TB of RAM, and 32 TB of SSD storage. I'm having trouble enabling Hive LLAP due to the lack of comprehensive documentation. I’ve experimented with various parameter settings and different combinations to run LLAP daemons, but none of them have worked. Here are my cluster configurations:

# hive-site.xml
  <property>
    <name>hive.llap.execution.mode</name>
    <value>all</value>
  </property>
  <property>
    <name>hive.execution.mode</name>
    <value>llap</value>
  </property>
  <property>
    <name>hive.llap.daemon.yarn.container.mb</name>
    <value>419840</value>
  </property>
  <property>
    <name>hive.llap.io.memory.size</name>
    <value>41984</value>
  </property>
  <property>
    <name>hive.llap.daemon.service.hosts</name>
    <value>@llap0</value>
  </property>
  <property>
    <name>hive.llap.daemon.num.executors</name>
    <value>8</value>
  </property>
  <property>
    <name>hive.llap.zk.registry.user</name>
    <value>llap</value>
  </property>
  <property>
    <name>hive.llap.daemon.queue.name</name>
    <value>llap</value>
  </property>
# fair-scheduler.xml
<?xml version="1.0"?>
<allocations>
  <pool name="hadoop">
    <minResources>8192mb,1vcores</minResources>
    <maxResources>8396800mb,960vcores</maxResources>
    <maxRunningApps>1000</maxRunningApps>
    <weight>1.0</weight>
    <fairSharePreemptionThreshold>0.7</fairSharePreemptionThreshold>
    <fairSharePreemptionTimeout>5</fairSharePreemptionTimeout>
  </pool>
  <pool name="llap">
    <minResources>8192mb,1vcores</minResources>
    <maxResources>4198400mb,480vcores</maxResources>
    <maxRunningApps>1000</maxRunningApps>
    <weight>2.0</weight>
    <fairSharePreemptionThreshold>0.5</fairSharePreemptionThreshold>
    <fairSharePreemptionTimeout>5</fairSharePreemptionTimeout>
  </pool>
</allocations>
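
For context, here is my own sanity-check arithmetic relating the values above to the hardware; none of this comes from documentation, it is just how I sized things:

# My sizing reasoning (all numbers taken from the configs above):
# hive.llap.daemon.yarn.container.mb = 419840 MB (~410 GB) per daemon
# hive.llap.io.memory.size           = 41984 MB  (~41 GB) off-heap cache per daemon
# Each node has 1 TB of RAM, so one ~410 GB daemon should fit per node.
echo $((419840 * 10))   # 4198400 -> matches maxResources of the llap queue (10 daemons)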

To start the LLAP daemons, I run the command below:

hive --service llap --name llap0 --instances 1 --loglevel debug --cache 41984m --executors 8 --iothreads 2 --size 419840m --xmx 41984m --startImmediately --javaHome $JAVA_HOME

Because of the --startImmediately parameter, this command creates and launches the application as a YARN service. The application appears to start successfully, but when I submit a query, I receive the error shown below:

killed/failed due to:INIT_FAILURE, Fail to create InputInitializerManager, org.apache.tez.dag.api.TezReflectionException: Unable to instantiate class with 1 arguments: org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator
  at org.apache.tez.common.ReflectionUtils.getNewInstance(ReflectionUtils.java:68)
  at org.apache.tez.common.ReflectionUtils.createClazzInstance(ReflectionUtils.java:86)
  at org.apache.tez.dag.app.dag.RootInputInitializerManager$1.run(RootInputInitializerManager.java:155)
  at org.apache.tez.dag.app.dag.RootInputInitializerManager$1.run(RootInputInitializerManager.java:151)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
  at org.apache.tez.dag.app.dag.RootInputInitializerManager.createInitializer(RootInputInitializerManager.java:151)
  at org.apache.tez.dag.app.dag.RootInputInitializerManager.runInputInitializers(RootInputInitializerManager.java:123)
  at org.apache.tez.dag.app.dag.impl.VertexImpl.setupInputInitializerManager(VertexImpl.java:4315)
  at org.apache.tez.dag.app.dag.impl.VertexImpl.access$3200(VertexImpl.java:216)
  at org.apache.tez.dag.app.dag.impl.VertexImpl$InitTransition.handleInitEvent(VertexImpl.java:3089)
  at org.apache.tez.dag.app.dag.impl.VertexImpl$InitTransition.transition(VertexImpl.java:3036)
  at org.apache.tez.dag.app.dag.impl.VertexImpl$InitTransition.transition(VertexImpl.java:3018)
  at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
  at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
  at org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
  at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
  at org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:59)
  at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:2079)
  at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:215)
  at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:2245)
  at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:2231)
  at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:195)
  at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:115)
  at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.reflect.InvocationTargetException
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
  at org.apache.tez.common.ReflectionUtils.getNewInstance(ReflectionUtils.java:65)
  ... 25 more
Caused by: java.lang.IllegalArgumentException: No running LLAP daemons! Please check LLAP service status and zookeeper configuration
  at com.google.common.base.Preconditions.checkArgument(Preconditions.java:122)
  at org.apache.hadoop.hive.ql.exec.tez.Utils.getSplitLocationProvider(Utils.java:57)
  at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.<init>(HiveSplitGenerator.java:140)
  ... 30 more

Vertex killed, vertexName=Reducer 2, vertexId=vertex_1724140529947_0103_2_01, diagnostics=[Vertex received Kill in NEW state., Vertex vertex_1724140529947_0103_2_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]

DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1
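
Since the exception says to check the LLAP service status and the ZooKeeper configuration, these are the checks I have in mind; the ZooKeeper path is only my guess for an unsecured cluster with hive.llap.zk.registry.user=llap and may not match the actual znode:

# Status of the YARN service created by the llap package
yarn app -status llap0

# LLAP daemon status as Hive sees it (same service name as --name in the command above)
hive --service llapstatus --name llap0

# Look for the daemon registration in ZooKeeper; /llap-unsecure is my assumption,
# <zk-host> stands for one of the quorum hosts
zkCli.sh -server <zk-host>:2181 ls /llap-unsecure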

Additionally, the command creates a folder containing a run.sh script. However, when I try to run that script, I get the following error:

ERROR client.ApiServiceClient: Fail to launch application:
java.io.IOException:
        at org.apache.hadoop.yarn.service.client.ApiServiceClient.getRMWebAddress(ApiServiceClient.java:153)
        at org.apache.hadoop.yarn.service.client.ApiServiceClient.getServicePath(ApiServiceClient.java:171)
        at org.apache.hadoop.yarn.service.client.ApiServiceClient.getApiClient(ApiServiceClient.java:235)
        at org.apache.hadoop.yarn.service.client.ApiServiceClient.actionLaunch(ApiServiceClient.java:380)
        at org.apache.hadoop.yarn.client.cli.ApplicationCLI.executeLaunchCommand(ApplicationCLI.java:1265)
        at org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:198)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:97)
        at org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:128)
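
As far as I can tell from the trace, the failure happens in yarn app -launch, which is roughly what I understand run.sh to wrap (my paraphrase below, not the exact script contents):

# My rough paraphrase of what the generated run.sh does:
yarn app -launch llap0 Yarnfile   # Yarnfile is the service spec generated alongside run.sh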

In the application logs, I see the following errors:

ERROR [main-EventThread ()] org.apache.curator.framework.imps.EnsembleTracker: Invalid config event received: {server.1=xxx.xxx.xxx.xxx:2888:3888:participant, version=0, server.3=xxx.xxx.xxx.xxx:2888:3888:participant, server.2=xxx.xxx.xxx.xxx:2888:3888:participant}
INFO  [main-EventThread ()] org.apache.curator.framework.imps.EnsembleTracker: New config event received: {server.1=xxx.xxx.xxx.xxx:2888:3888:participant, version=0, server.3=xxx.xxx.xxx.xxx:2888:3888:participant, server.2=xxx.xxx.xxx.xxx:2888:3888:participant}
ERROR [main-EventThread ()] org.apache.curator.framework.imps.EnsembleTracker: Invalid config event received: {server.1=xxx.xxx.xxx.xxx:2888:3888:participant, version=0, server.3=xxx.xxx.xxx.xxx:2888:3888:participant, server.2=xxx.xxx.xxx.xxx:2888:3888:participant}
2024-08-19T14:53:54,675 ERROR [main-EventThread ()] org.apache.zookeeper.ClientCnxn: Unexpected throwable
java.lang.NoClassDefFoundError: org/apache/zookeeper/proto/GetAllChildrenNumberResponse
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:654) ~[zookeeper-3.7.1.jar:3.7.1]
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:553) ~[zookeeper-3.7.1.jar:3.7.1]
Caused by: java.lang.ClassNotFoundException: org.apache.zookeeper.proto.GetAllChildrenNumberResponse
    at java.net.URLClassLoader.findClass(URLClassLoader.java:387) ~[?:1.8.0_371]
    at java.lang.ClassLoader.loadClass(ClassLoader.java:436) ~[?:1.8.0_371]
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:355) ~[?:1.8.0_371]
    at java.lang.ClassLoader.loadClass(ClassLoader.java:369) ~[?:1.8.0_371]
    ... 2 more
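
The stack frame reports zookeeper-3.7.1.jar, yet the class cannot be found, so I suspect a second, older ZooKeeper jar somewhere on the classpath (GetAllChildrenNumberResponse only exists from ZooKeeper 3.6 on). This is the kind of check I plan to run; the directories are just where jars live on my install:

# List every ZooKeeper jar that Hive, Tez and Hadoop can pull onto the classpath;
# an older zookeeper-3.4.x/3.5.x jar showing up here would explain the missing class.
find $HIVE_HOME/lib $TEZ_HOME $HADOOP_HOME/share/hadoop -name 'zookeeper*.jar' 2>/dev/null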

Can someone help me identify the main issue and how to resolve it? Any detailed instructions on how to properly enable Hive with LLAP would be greatly appreciated.

Upvotes: 0

Views: 58

Answers (0)
