Nicholas Liu

Reputation: 125

How to address S3 error: org.jets3t.service.S3ServiceException: S3 GET failed? Java

I am trying to get and read a parquet file on S3 with the Apache Parquet Reader, and my code looks something like this:

ParquetReader<GenericData.Record> reader = null;
Path internalPath = new Path("s3://S3AccessID:S3SecretKey@bucketName/tmp0.parquet");
try {
    InputFile inputFile = HadoopInputFile.fromPath(internalPath, new Configuration());
    reader = AvroParquetReader.<GenericData.Record>builder(inputFile).build();
    GenericData.Record record;
    // Print every record until the reader is exhausted
    while ((record = reader.read()) != null) {
        System.out.println(record);
    }
} catch (IOException e) {
    e.printStackTrace();
}

However, when I build and run the program, I get this error:

        at org.apache.hadoop.fs.s3.Jets3tFileSystemStore.get(Jets3tFileSystemStore.java:156)
        at org.apache.hadoop.fs.s3.Jets3tFileSystemStore.retrieveINode(Jets3tFileSystemStore.java:195)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:567)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
        at com.sun.proxy.$Proxy12.retrieveINode(Unknown Source)
        at org.apache.hadoop.fs.s3.S3FileSystem.getFileStatus(S3FileSystem.java:332)
        at org.apache.parquet.hadoop.util.HadoopInputFile.fromPath(HadoopInputFile.java:39)
        at read.read.readParquetFile(read.java:153)
        at read.read.main(read.java:80)
Caused by: org.jets3t.service.S3ServiceException: S3 GET failed for '/%2Ftmp0.parquet' XML Error Message: <?xml version="1.0" encoding="UTF-8"?><Error><Code>InvalidRequest</Code><Message>The authorization mechanism you have provided is not supported. Please use AWS4-HMAC-SHA256.</Message><RequestId>1A66095653EBAD50</RequestId><HostId>jNzbaMmKmszHiLvzA4NsqILRxF+qJFxJLTWvKVwqHoggB0MnYy1ESoajHaa/Ufs5RE8ghs31Jaw=</HostId>

Does anyone have any idea how to address this?

Upvotes: 1

Views: 3361

Answers (1)

franklinsijo

Reputation: 18270

From the error message, it looks like your S3 bucket's region only supports the Signature Version 4 (v4) signing protocol and not the older version (v2):

The authorization mechanism you have provided is not supported. Please use AWS4-HMAC-SHA256.

In that case, you must set the property fs.s3a.endpoint to the endpoint of your bucket's region, either in core-site.xml or in the job configuration. The endpoint values are listed in the AWS documentation under Amazon S3 Endpoints.

Additionally,

  1. Use Hadoop's s3a client instead of s3, that is, an s3a:// path instead of s3://.

  2. Rather than embedding the access key and secret key in the URL, set them through the properties fs.s3a.access.key and fs.s3a.secret.key. The full list of properties that can be used for S3 authentication is in the Hadoop S3A documentation.
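
As a rough sketch of the points above, the settings could be collected like this before being handed to Hadoop. The class S3aSettings, its methods, the endpoint and the key values are placeholders made up for illustration; only the three property names come from this answer. The endpoint shown is an assumption and has to be replaced with the one for your bucket's region.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that gathers the s3a settings mentioned in the answer. */
public class S3aSettings {

    /** Returns the three s3a properties; the values are supplied by the caller. */
    public static Map<String, String> properties(String endpoint, String accessKey, String secretKey) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("fs.s3a.endpoint", endpoint);
        props.put("fs.s3a.access.key", accessKey);
        props.put("fs.s3a.secret.key", secretKey);
        return props;
    }

    /** Builds an s3a:// URI with no credentials embedded in it. */
    public static URI objectUri(String bucket, String key) {
        return URI.create("s3a://" + bucket + "/" + key);
    }

    public static void main(String[] args) {
        // Placeholder values: use the endpoint of your bucket's region and your own keys.
        Map<String, String> props = properties(
                "s3.eu-central-1.amazonaws.com", "ACCESS_KEY_PLACEHOLDER", "SECRET_KEY_PLACEHOLDER");

        // Print the settings, masking the secret key.
        props.forEach((k, v) ->
                System.out.println(k + " = " + (k.endsWith("secret.key") ? "****" : v)));
        System.out.println(objectUri("bucketName", "tmp0.parquet"));

        // With Hadoop and Parquet on the classpath, the values would be applied roughly as:
        //   Configuration conf = new Configuration();
        //   props.forEach(conf::set);
        //   InputFile inputFile = HadoopInputFile.fromPath(
        //           new Path(objectUri("bucketName", "tmp0.parquet")), conf);
    }
}
```

In real code the keys would normally come from core-site.xml, environment variables or a credentials provider rather than from string literals in the source.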

Upvotes: 4
