ffriend

Reputation: 28492

Write to S3 from Spark without access and secret keys

Our EC2 server is configured to allow access to my-bucket when using DefaultAWSCredentialsProviderChain, so the following code using plain AWS SDK works fine:

AmazonS3 s3client = new AmazonS3Client(new DefaultAWSCredentialsProviderChain());
s3client.putObject(new PutObjectRequest("my-bucket", "my-object", "/path/to/my-file.txt"));

Spark's S3AOutputStream uses the same SDK internally; however, trying to upload a file without providing access and secret keys doesn't work:

sc.hadoopConfiguration().set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem");
// not setting access and secret key
JavaRDD<String> rdd = sc.parallelize(Arrays.asList("hello", "stackoverflow"));
rdd.saveAsTextFile("s3a://my-bucket/my-file-txt");

gives:

Exception in thread "main" com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 403, AWS Service: Amazon S3, AWS Request ID: 25DF243A166206A0, AWS Error Code: null, AWS Error Message: Forbidden, S3 Extended Request ID: Ki5SP11xQEMKb0m0UZNXb4FhfWLMdbehbknQ+jeZuO/wjhwurjkFoEYVfrQfW1KIq435Lo9jPkw=  
    at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:798)  
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:421)  
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528)
    at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:976)  
    at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:956)  
    at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:892)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:77)
    at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1426)
    at org.apache.hadoop.mapred.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:130)
    <truncated>

Is there a way to force Spark to use default credential provider chain instead of relying on access and secret key?

Upvotes: 1

Views: 1506

Answers (1)

stevel

Reputation: 13430

Technically, that's Hadoop's s3a output stream, not Spark's. Look at the stack trace to see who to file bug reports against :)

And s3a does support Instance Credentials from Hadoop 2.7+ (see the S3AFileSystem source for proof).
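
If you want to be explicit about it rather than relying on the fallback order, newer Hadoop versions let you pin the provider in the Hadoop configuration. The sketch below assumes Hadoop 2.8+, where the fs.s3a.aws.credentials.provider property exists; on 2.7 it is ignored and the instance-profile provider is simply tried automatically when no keys are configured.

// Sketch for Hadoop 2.8+ only: pin s3a to the EC2 instance-profile provider
// instead of letting it look for access/secret keys first.
sc.hadoopConfiguration().set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem");
sc.hadoopConfiguration().set("fs.s3a.aws.credentials.provider",
        "com.amazonaws.auth.InstanceProfileCredentialsProvider");

JavaRDD<String> rdd = sc.parallelize(Arrays.asList("hello", "stackoverflow"));
rdd.saveAsTextFile("s3a://my-bucket/my-file-txt");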

If you can't connect, you need to have the Hadoop 2.7 JARs on your classpath, along with the exact version of the AWS SDK they were built against (1.7.4, as I recall).
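
A quick way to check that from the driver is a reflective load of the classes the stack trace already mentions; just a sketch, it only proves the JARs are on the classpath, not that the versions match.

// Sketch: verify that the s3a filesystem and the AWS SDK classes it needs
// can actually be loaded before submitting real work.
try {
    Class.forName("org.apache.hadoop.fs.s3a.S3AFileSystem");
    Class.forName("com.amazonaws.auth.DefaultAWSCredentialsProviderChain");
    System.out.println("hadoop-aws and aws-java-sdk are both on the classpath");
} catch (ClassNotFoundException e) {
    System.out.println("missing JAR: " + e.getMessage());
}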

Spark has one little feature: if you submit work with the AWS_* environment variables set, it picks them up, copies them in as the fs.s3a keys, and so propagates them to your systems.
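
For illustration only, this is roughly what that propagation amounts to; the property names are the standard fs.s3a credential keys, and you don't need the snippet if you let Spark do it for you.

// Illustration of the env-var propagation described above: Spark effectively
// copies AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY from the submitting shell
// into the Hadoop configuration as the fs.s3a credential keys.
String accessKey = System.getenv("AWS_ACCESS_KEY_ID");
String secretKey = System.getenv("AWS_SECRET_ACCESS_KEY");
if (accessKey != null && secretKey != null) {
    sc.hadoopConfiguration().set("fs.s3a.access.key", accessKey);
    sc.hadoopConfiguration().set("fs.s3a.secret.key", secretKey);
}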

Upvotes: 1
