Willis Zhang
Willis Zhang

Reputation: 109

Create Hive External Table on S3 throws "org.apache.hadoop.fs.s3a.S3AFileSystem not found" Exception

I'm using beeline on my local machine to run below DDL, and throws the exception.

the DDL is

CREATE TABLE `report_landing_pages`( 
  `google_account_id` string COMMENT 'from deserializer', 
  `ga_view_id` string COMMENT 'from deserializer', 
  `path` string COMMENT 'from deserializer', 
  `users` string COMMENT 'from deserializer', 
  `page_views` string COMMENT 'from deserializer', 
  `event_value` string COMMENT 'from deserializer', 
  `report_date` string COMMENT 'from deserializer') 
PARTITIONED BY (`dt` date) 
ROW FORMAT SERDE 
  'org.apache.hadoop.hive.serde2.OpenCSVSerde' 
STORED AS TEXTFILE
LOCATION 's3a://bucket_name/table'

the exception is

org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found) at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:380) at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:257) at org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91) at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:348) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:362) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found) at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:862) at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:867) at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4356) at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:354) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2183) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1839) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1526) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1232) at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:255) ... 11 more Caused by: MetaException(message:java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result$create_table_with_environment_context_resultStandardScheme.read(ThriftHiveMetastore.java:42070) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result$create_table_with_environment_context_resultStandardScheme.read(ThriftHiveMetastore.java:42038) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result.read(ThriftHiveMetastore.java:41964) at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_create_table_with_environment_context(ThriftHiveMetastore.java:1199) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.create_table_with_environment_context(ThriftHiveMetastore.java:1185) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.create_table_with_environment_context(HiveMetaStoreClient.java:2399) at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.create_table_with_environment_context(SessionHiveMetaStoreClient.java:93) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:752) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:740) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:173) at com.sun.proxy.$Proxy34.createTable(Unknown Source) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2330) at com.sun.proxy.$Proxy34.createTable(Unknown Source) at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:852) ... 22 more

And my local HDFS works fine with "hdfs dfs -mkdir s3a://bucket/table"

And the wierd thing is, if I created the table not on S3 first, and then manually update the location of the table to s3 in metastore later, the select statement like select COUNT(*) from report_landing_pages group by google_account_id works fine.

How to fix exception in DDL?

BTW, I'm running with Hive 2.3.2, Hadoop 2.7.5 under MacOS X EI Caption.

Upvotes: 0

Views: 4444

Answers (4)

Arun
Arun

Reputation: 760

Yes, that's correct. After placing the jar, the hive metastore service needs to be restarted

To restart the hive metastore, below steps needs to be followed:

1. ps -ef | grep 'hive'

Use above command to identify the Process ID (PID) that hive is using. By default, the hive used 9083 port number, so for a running hive metastore service, you can also check the PID using lsof -i:9083 command.

2. kill <process number>

This kills the existing hive metastore service.

3. hive --service metastore

This command starts the metastore service again.

OR If you are using hive-server 2, use the following command:

$ sudo /etc/init.d/hive-metastore start
$ sudo /etc/init.d/hive-metastore stop

Upvotes: 0

Willis Zhang
Willis Zhang

Reputation: 109

Problem solved.

After place the S3 jars, the metastore service should also be restarted, besides hiveserver2.

Upvotes: 1

pavan kumar
pavan kumar

Reputation: 102

I tried creating the same table in my environment and it worked.

Check your : fs.s3a.access.key

"fs.s3a.secret.key","fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem" 

properties in the configuration file.

Upvotes: 0

pavan kumar
pavan kumar

Reputation: 102

Did you place all the required jars s3 etc to the classpath? (org.apache.hadoop.fs.s3a.S3AFileSystem)- this is a Hadoop class, found in the Hadoop-aws jar. An exception reporting one of these classes is missing means that this jar is not in the class path.

Upvotes: 0

Related Questions