Reputation: 31
I am trying to write a Spark DataFrame to an AWS S3 bucket using PySpark and am getting an exception that the encryption method specified is not supported. The bucket has server-side encryption set up.
I have the following packages configured in spark-defaults.conf:
spark.jars.packages com.amazonaws:aws-java-sdk:1.9.5,org.apache.hadoop:hadoop-aws:3.2.0
I reviewed this existing thread: Doesn't Spark/Hadoop support SSE-KMS encryption on AWS S3, which says that the versions above should support SSE-KMS encryption.
I also added the property fs.s3a.server-side-encryption-algorithm, set to SSE-KMS, to core-site.xml.
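The core-site.xml entry looks like this:

<property>
  <name>fs.s3a.server-side-encryption-algorithm</name>
  <value>SSE-KMS</value>
</property>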
But I still get the error. Note that this works fine for buckets without SSE-KMS.
Error Message: AmazonS3Exception: Status Code: 400, AWS Service: Amazon S3, AWS Error Code: InvalidArgument, AWS Error Message: The encryption method specified is not supported
Upvotes: 2
Views: 6931
Reputation: 31
Thanks for all your input, Steve. Adding the following to spark-defaults.conf fixed our issue:
spark.hadoop.fs.s3a.server-side-encryption-algorithm AES256
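For anyone who prefers setting this per job rather than in spark-defaults.conf, the same property can be passed through the SparkSession builder, since any spark.hadoop.* option is forwarded to the underlying Hadoop configuration. A minimal sketch (bucket and path are placeholders):

from pyspark.sql import SparkSession

# Apply the SSE setting for this session only; the spark.hadoop. prefix
# forwards the option to the S3A filesystem's Hadoop configuration.
spark = (
    SparkSession.builder
    .appName("s3a-sse-write")
    .config("spark.hadoop.fs.s3a.server-side-encryption-algorithm", "AES256")
    .getOrCreate()
)

df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])
df.write.mode("overwrite").parquet("s3a://my-bucket/output/")  # hypothetical bucket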
Upvotes: 1
Reputation: 13430
Hadoop 3.2.0 absolutely supports SSE-KMS, so whatever the problem is, it'll be one of: the SSE-KMS key used in the config, your permissions to access it, or some other quirk (e.g. the key isn't in the same region as the bucket).
But: that release is built against AWS SDK 1.11.375 (see the hadoop-aws 3.2.0 entry on mvnrepository). Mixing JAR versions is generally doomed. That may be a factor, it may not.
You got a 400 back from the far end, meaning something was rejected there.
Recommend aligning the AWS SDK JAR with the version hadoop-aws 3.2.0 was built against, as sketched below.
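A sketch of what that pairing could look like in spark-defaults.conf, assuming the aws-java-sdk-bundle artifact that hadoop-aws 3.2.x declares as its dependency:

spark.jars.packages org.apache.hadoop:hadoop-aws:3.2.0,com.amazonaws:aws-java-sdk-bundle:1.11.375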
Note: it doesn't matter at all what the fs.s3a encryption settings are when you are trying to read data. S3 knows the KMS key used and will automatically use it to decrypt, provided you have the permissions. That's a good way to check you have read permissions on a key.
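A quick sketch of that read check in PySpark (the path is a placeholder):

# No encryption settings needed for reads: S3 resolves the KMS key itself,
# so a successful read confirms you can decrypt with that key.
df = spark.read.parquet("s3a://my-bucket/kms-encrypted-data/")  # hypothetical path
df.show(5)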
Upvotes: 1