Reputation: 347
I'm facing an issue with Hive on Tez.
I can select from an existing Hive table without any problem:
SELECT * FROM Transactions;
But when I try to use an aggregate function or COUNT(*) on this table, like:
SELECT COUNT(*) FROM Transactions;
I get the following error in the hive.log file:
2017-08-13T10:04:27,892 INFO [4a5b6a0c-9edb-45ea-8d49-b2f4b0d2b636 main] conf.HiveConf: Using the default value passed in for log id: 4a5b6a0c-9edb-45ea-8d49-b2f4b0d2b636
2017-08-13T10:04:27,910 INFO [4a5b6a0c-9edb-45ea-8d49-b2f4b0d2b636 main] session.SessionState: Error closing tez session
java.lang.RuntimeException: java.util.concurrent.ExecutionException: org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. Application application_1498057873641_0017 failed 2 times due to AM Container for appattempt_1498057873641_0017_000002 exited with exitCode: -1000
Failing this attempt.Diagnostics: java.io.FileNotFoundException: File /tmp/hadoop-hadoop/nm-local-dir/filecache does not exist
For more detailed output, check the application tracking page: http://hadoop-master:8090/cluster/app/application_1498057873641_0017 Then click on links to logs of each attempt. . Failing the application.
    at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.isOpen(TezSessionState.java:173) ~[hive-exec-2.1.1.jar:2.1.1]
    at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.toString(TezSessionState.java:135) ~[hive-exec-2.1.1.jar:2.1.1]
    at java.lang.String.valueOf(String.java:2994) ~[?:1.8.0_131]
    at java.lang.StringBuilder.append(StringBuilder.java:131) ~[?:1.8.0_131]
    at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.closeIfNotDefault(TezSessionPoolManager.java:346) ~[hive-exec-2.1.1.jar:2.1.1]
    at org.apache.hadoop.hive.ql.session.SessionState.close(SessionState.java:1524) [hive-exec-2.1.1.jar:2.1.1]
    at org.apache.hadoop.hive.cli.CliSessionState.close(CliSessionState.java:66) [hive-cli-2.1.1.jar:2.1.1]
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:133) [hive-cli-2.1.1.jar:2.1.1]
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399) [hive-cli-2.1.1.jar:2.1.1]
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776) [hive-cli-2.1.1.jar:2.1.1]
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714) [hive-cli-2.1.1.jar:2.1.1]
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641) [hive-cli-2.1.1.jar:2.1.1]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_131]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_131]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_131]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_131]
    at org.apache.hadoop.util.RunJar.run(RunJar.java:234) [hadoop-common-2.8.0.jar:?]
    at org.apache.hadoop.util.RunJar.main(RunJar.java:148) [hadoop-common-2.8.0.jar:?]
Caused by: java.util.concurrent.ExecutionException: org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. Application application_1498057873641_0017 failed 2 times due to AM Container for appattempt_1498057873641_0017_000002 exited with exitCode: -1000
Failing this attempt.Diagnostics: java.io.FileNotFoundException: File /tmp/hadoop-hadoop/nm-local-dir/filecache does not exist
For more detailed output, check the application tracking page: http://hadoop-master:8090/cluster/app/application_1498057873641_0017 Then click on links to logs of each attempt. . Failing the application.
    at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[?:1.8.0_131]
    at java.util.concurrent.FutureTask.get(FutureTask.java:206) ~[?:1.8.0_131]
    at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.isOpen(TezSessionState.java:168) ~[hive-exec-2.1.1.jar:2.1.1]
    ... 17 more
Caused by: org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. Application application_1498057873641_0017 failed 2 times due to AM Container for appattempt_1498057873641_0017_000002 exited with exitCode: -1000
Failing this attempt.Diagnostics: java.io.FileNotFoundException: File /tmp/hadoop-hadoop/nm-local-dir/filecache does not exist
For more detailed output, check the application tracking page: http://hadoop-master:8090/cluster/app/application_1498057873641_0017 Then click on links to logs of each attempt. . Failing the application.
    at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:914) ~[tez-api-0.8.4.jar:0.8.4]
    at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:883) ~[tez-api-0.8.4.jar:0.8.4]
    at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.startSessionAndContainers(TezSessionState.java:416) ~[hive-exec-2.1.1.jar:2.1.1]
    at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.access$000(TezSessionState.java:97) ~[hive-exec-2.1.1.jar:2.1.1]
    at org.apache.hadoop.hive.ql.exec.tez.TezSessionState$1.call(TezSessionState.java:333) ~[hive-exec-2.1.1.jar:2.1.1]
    at org.apache.hadoop.hive.ql.exec.tez.TezSessionState$1.call(TezSessionState.java:329) ~[hive-exec-2.1.1.jar:2.1.1]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_131]
    at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_131]
I solved this issue by creating the missing directory "/tmp/hadoop-hadoop/nm-local-dir/filecache" on all cluster nodes.
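For reference, something along these lines can be used to create it everywhere (just a sketch, not the exact commands I ran; the node names are the ones from my ZooKeeper quorum below, and it assumes passwordless SSH as the hadoop user):

for node in hadoop-master hadoop-slave1 hadoop-slave2 hadoop-slave3 hadoop-slave4 hadoop-slave5; do
  # create the NodeManager local filecache directory that the AM container complained about
  ssh "$node" 'mkdir -p /tmp/hadoop-hadoop/nm-local-dir/filecache'
done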
Then I got another error in hive.log when trying to run SELECT COUNT(*) FROM Transactions;, as below:
2017-08-13T10:06:35,567 INFO [main] optimizer.ColumnPrunerProcFactory: RS 3 oldColExprMap: {VALUE._col0=Column[_col0]}
2017-08-13T10:06:35,568 INFO [main] optimizer.ColumnPrunerProcFactory: RS 3 newColExprMap: {VALUE._col0=Column[_col0]}
2017-08-13T10:06:35,604 INFO [213ea036-8245-4042-a5a1-ccd686ea2465 main] Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
2017-08-13T10:06:35,658 INFO [main] annotation.StatsRulesProcFactory: STATS-GBY[2]: Equals 0 in number of rows.0 rows will be set to 1
2017-08-13T10:06:35,679 INFO [main] optimizer.SetReducerParallelism: Number of reducers determined to be: 1
2017-08-13T10:06:35,680 INFO [main] parse.TezCompiler: Cycle free: true
2017-08-13T10:06:35,689 INFO [213ea036-8245-4042-a5a1-ccd686ea2465 main] Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
2017-08-13T10:06:35,741 INFO [main] parse.CalcitePlanner: Completed plan generation
2017-08-13T10:06:35,742 INFO [main] ql.Driver: Semantic Analysis Completed
2017-08-13T10:06:35,742 INFO [main] ql.Driver: Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:c0, type:bigint, comment:null)], properties:null)
2017-08-13T10:06:35,744 INFO [main] exec.ListSinkOperator: Initializing operator LIST_SINK[7]
2017-08-13T10:06:35,745 INFO [main] ql.Driver: Completed compiling command(queryId=hadoop_20170813100633_31ca0425-6aca-434c-8039-48bc0e761095); Time taken: 2.131 seconds
2017-08-13T10:06:35,768 INFO [main] ql.Driver: Executing command(queryId=hadoop_20170813100633_31ca0425-6aca-434c-8039-48bc0e761095): select count(*) from transactions
2017-08-13T10:06:35,768 INFO [main] ql.Driver: Query ID = hadoop_20170813100633_31ca0425-6aca-434c-8039-48bc0e761095
2017-08-13T10:06:35,768 INFO [main] ql.Driver: Total jobs = 1
2017-08-13T10:06:35,784 INFO [main] ql.Driver: Launching Job 1 out of 1
2017-08-13T10:06:35,784 INFO [main] ql.Driver: Starting task [Stage-1:MAPRED] in serial mode
2017-08-13T10:06:35,789 INFO [main] tez.TezSessionPoolManager: The current user: hadoop, session user: hadoop
2017-08-13T10:06:35,789 INFO [main] tez.TezSessionPoolManager: Current queue name is null incoming queue name is null
2017-08-13T10:06:35,838 INFO [213ea036-8245-4042-a5a1-ccd686ea2465 main] Configuration.deprecation: mapred.committer.job.setup.cleanup.needed is deprecated. Instead, use mapreduce.job.committer.setup.cleanup.needed
2017-08-13T10:06:35,840 INFO [main] ql.Context: New scratch dir is hdfs://hadoop-master:8020/tmp/hive/hadoop/213ea036-8245-4042-a5a1-ccd686ea2465/hive_2017-08-13_10-06-33_614_5648783469307420794-1
2017-08-13T10:06:35,845 INFO [main] exec.Task: Session is already open
2017-08-13T10:06:35,847 INFO [main] tez.DagUtils: Localizing resource because it does not exist: file:/opt/apache-tez-0.8.4-bin to dest: hdfs://hadoop-master:8020/tmp/hive/hadoop/_tez_session_dir/213ea036-8245-4042-a5a1-ccd686ea2465/apache-tez-0.8.4-bin
2017-08-13T10:06:35,850 INFO [main] tez.DagUtils: Looks like another thread or process is writing the same file
2017-08-13T10:06:35,851 INFO [main] tez.DagUtils: Waiting for the file hdfs://hadoop-master:8020/tmp/hive/hadoop/_tez_session_dir/213ea036-8245-4042-a5a1-ccd686ea2465/apache-tez-0.8.4-bin (5 attempts, with 5000ms interval)
2017-08-13T10:07:00,860 ERROR [main] tez.DagUtils: Could not find the jar that was being uploaded
2017-08-13T10:07:00,861 ERROR [main] exec.Task: Failed to execute tez graph.
java.io.IOException: Previous writer likely failed to write hdfs://hadoop-master:8020/tmp/hive/hadoop/_tez_session_dir/213ea036-8245-4042-a5a1-ccd686ea2465/apache-tez-0.8.4-bin. Failing because I am unlikely to write too.
    at org.apache.hadoop.hive.ql.exec.tez.DagUtils.localizeResource(DagUtils.java:1022)
    at org.apache.hadoop.hive.ql.exec.tez.DagUtils.addTempResources(DagUtils.java:902)
    at org.apache.hadoop.hive.ql.exec.tez.DagUtils.localizeTempFilesFromConf(DagUtils.java:845)
    at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.refreshLocalResourcesFromConf(TezSessionState.java:466)
    at org.apache.hadoop.hive.ql.exec.tez.TezTask.updateSession(TezTask.java:294)
    at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:155)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2073)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1744)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1453)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1171)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:234)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
2017-08-13T10:07:00,880 ERROR [main] ql.Driver: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask
I checked the JIRA issue https://issues.apache.org/jira/browse/AMBARI-9821 for this Hive problem, but I'm still getting this error when running COUNT(*) on this table.
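The error above points at a half-written Tez session resource in HDFS, so one thing that can be checked (a sketch only; the path is copied from the log, and removing it assumes the session that owned it is gone) is whether the leftover upload is still there:

hdfs dfs -ls hdfs://hadoop-master:8020/tmp/hive/hadoop/_tez_session_dir/
# if the partial upload from the failed session is still present, it can be removed
# so that the next session re-localizes the Tez binaries
hdfs dfs -rm -r hdfs://hadoop-master:8020/tmp/hive/hadoop/_tez_session_dir/213ea036-8245-4042-a5a1-ccd686ea2465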
Tez Conf File:
<configuration>
<property>
<name>tez.lib.uris</name>
<value>hdfs://hadoop-master:8020/user/tez/apache-tez-0.8.4-bin/share/tez.tar.gz</value>
<type>string</type>
</property>
</configuration>
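As a sanity check on this setting, the tarball referenced by tez.lib.uris can be verified to exist in HDFS and be readable by the hadoop user (sketch; the path is the one from the value above):

hdfs dfs -ls hdfs://hadoop-master:8020/user/tez/apache-tez-0.8.4-bin/share/tez.tar.gz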
Hive Conf File:
<configuration>
<property>
<name>hive.server2.thrift.http.port</name>
<value>10001</value>
</property>
<property>
<name>hive.server2.thrift.http.min.worker.threads</name>
<value>5</value>
</property>
<property>
<name>hive.server2.thrift.http.max.worker.threads</name>
<value>500</value>
</property>
<property>
<name>hive.server2.thrift.http.path</name>
<value>cliservice</value>
</property>
<property>
<name>hive.server2.thrift.min.worker.threads</name>
<value>5</value>
</property>
<property>
<name>hive.server2.thrift.max.worker.threads</name>
<value>500</value>
</property>
<property>
<name>hive.server2.transport.mode</name>
<value>http</value>
<description>Server transport mode. "binary" or "http".</description>
</property>
<property>
<name>hive.server2.allow.user.substitution</name>
<value>true</value>
</property>
<property>
<name>hive.server2.authentication</name>
<value>NONE</value>
</property>
<property>
<name>hive.server2.thrift.bind.host</name>
<value>10.100.38.136</value>
</property>
<property>
<name>hive.support.concurrency</name>
<description>Enable Hive's Table Lock Manager Service</description>
<value>true</value>
</property>
<property>
<name>hive.zookeeper.quorum</name>
<description>Zookeeper quorum used by Hive's Table Lock Manager</description>
<value>hadoop-master,hadoop-slave1,hadoop-slave2,hadoop-slave3,hadoop-slave4,hadoop-slave5</value>
</property>
<property>
<name>hive.zookeeper.client.port</name>
<value>2181</value>
<description>The port at which the clients will connect.</description>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby://hadoop-master:1527/metastore_db2</value>
<description>
JDBC connect string for a JDBC metastore.
To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
</description>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
<property>
<name>hive.server2.webui.host</name>
<value>10.100.38.136</value>
</property>
<property>
<name>hive.server2.webui.port</name>
<value>10010</value>
</property>
<!--<property>
<name>hive.metastore.local</name>
<value>true</value>
</property>
<property>
<name>hive.metastore.uris</name>
<value/>
<value>thrift://hadoop-master:9083</value>
<value>file:///source/apache-hive-2.1.1-bin/bin/metastore_db/</value>
<description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>-->
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>org.apache.derby.jdbc.ClientDriver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.PersistenceManagerFactoryClass</name>
<value>org.datanucleus.api.jdo.JDOPersistenceManagerFactory</value>
<description>class implementing the jdo persistence</description>
</property>
<property>
<name>datanucleus.autoStartMechanism</name>
<value>SchemaTable</value>
</property>
<property>
<name>hive.execution.engine</name>
<value>tez</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>APP</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>mine</value>
</property>
<!--<property>
<name>datanucleus.autoCreateSchema</name>
<value>false</value>
<description>Creates necessary schema on a startup if one doesn't exist</description>
</property> -->
</configuration>
This is also the diagnostics output from YARN:
Application application_1498057873641_0018 failed 2 times due to AM Container for appattempt_1498057873641_0018_000002 exited with exitCode: -103
Failing this attempt.Diagnostics: Container [pid=31779,containerID=container_1498057873641_0018_02_000001] is running beyond virtual memory limits. Current usage: 169.3 MB of 1 GB physical memory used; 2.6 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1498057873641_0018_02_000001 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 31786 31779 31779 31779 (java) 587 61 2710179840 43031 /opt/jdk-8u131/jdk1.8.0_131/bin/java -Xmx819m -Djava.io.tmpdir=/tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/application_1498057873641_0018/container_1498057873641_0018_02_000001/tmp -server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseParallelGC -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator -Dlog4j.configuration=tez-container-log4j.properties -Dyarn.app.container.log.dir=/opt/hadoop/hadoop-2.8.0/logs/userlogs/application_1498057873641_0018/container_1498057873641_0018_02_000001 -Dtez.root.logger=INFO,CLA -Dsun.nio.ch.bugLevel= org.apache.tez.dag.app.DAGAppMaster --session
|- 31779 31777 31779 31779 (bash) 0 0 115838976 306 /bin/bash -c /opt/jdk-8u131/jdk1.8.0_131/bin/java -Xmx819m -Djava.io.tmpdir=/tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/application_1498057873641_0018/container_1498057873641_0018_02_000001/tmp -server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseParallelGC -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator -Dlog4j.configuration=tez-container-log4j.properties -Dyarn.app.container.log.dir=/opt/hadoop/hadoop-2.8.0/logs/userlogs/application_1498057873641_0018/container_1498057873641_0018_02_000001 -Dtez.root.logger=INFO,CLA -Dsun.nio.ch.bugLevel='' org.apache.tez.dag.app.DAGAppMaster --session 1>/opt/hadoop/hadoop-2.8.0/logs/userlogs/application_1498057873641_0018/container_1498057873641_0018_02_000001/stdout 2>/opt/hadoop/hadoop-2.8.0/logs/userlogs/application_1498057873641_0018/container_1498057873641_0018_02_000001/stderr
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
For more detailed output, check the application tracking page: http://hadoop-master:8090/cluster/app/application_1498057873641_0018 Then click on links to logs of each attempt. . Failing the application.
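The diagnostics show the AM container (1 GB physical) being killed for exceeding its 2.1 GB virtual memory allowance, which matches the standard yarn.nodemanager.vmem-pmem-ratio default of 2.1. A quick way to see how those limits are configured on a node is a grep like the following (sketch only; the yarn-site.xml location assumes the Hadoop install directory that appears in the log above):

# look up the virtual-memory check and ratio settings on this NodeManager
grep -A 1 -E "yarn.nodemanager.vmem-(check-enabled|pmem-ratio)" \
  /opt/hadoop/hadoop-2.8.0/etc/hadoop/yarn-site.xml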
Upvotes: 1
Views: 3691
Reputation: 2264
Most likely you are hitting https://issues.apache.org/jira/browse/HIVE-16398. As a workaround, you will have to add the following to /usr/hdp//hive/conf/hive-env.sh:
# Folder containing extra libraries required for hive compilation/execution can be controlled by:
if [ "${HIVE_AUX_JARS_PATH}" != "" ]; then
  if [ -f "${HIVE_AUX_JARS_PATH}" ]; then
    export HIVE_AUX_JARS_PATH=${HIVE_AUX_JARS_PATH}
  elif [ -d "/usr/hdp/current/hive-webhcat/share/hcatalog" ]; then
    export HIVE_AUX_JARS_PATH=/usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-core.jar
  fi
elif [ -d "/usr/hdp/current/hive-webhcat/share/hcatalog" ]; then
  export HIVE_AUX_JARS_PATH=/usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-core.jar
fi
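After editing hive-env.sh, the export can be sanity-checked before restarting the Hive services so the new environment is picked up; a minimal sketch (the path below is a placeholder, point it at the hive-env.sh you actually edited):

# placeholder path: adjust to your hive-env.sh location
source /path/to/hive-env.sh
echo "HIVE_AUX_JARS_PATH=${HIVE_AUX_JARS_PATH}"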
Upvotes: 1