Vikash Pareek

Reputation: 1181

How to write a spark rdd to S3 using server side encryption

I am trying to write an RDD to S3 with server-side encryption. Below is my code.

val sparkConf = new SparkConf().
  setMaster("local[*]").
  setAppName("aws-encryption")
val sc = new SparkContext(sparkConf)
sc.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", AWS_ACCESS_KEY)
sc.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", AWS_SECRET_KEY)
sc.hadoopConfiguration.setBoolean("fs.s3n.sse.enabled", true)
sc.hadoopConfiguration.set("fs.s3n.enableServerSideEncryption", "true")
sc.hadoopConfiguration.setBoolean("fs.s3n.enableServerSideEncryption", true)
sc.hadoopConfiguration.set("fs.s3n.sse", "SSE-KMS")
sc.hadoopConfiguration.set("fs.s3n.serverSideEncryptionAlgorithm", "SSE-KMS")
sc.hadoopConfiguration.set("fs.s3n.server-side-encryption-algorithm", "SSE-KMS")
sc.hadoopConfiguration.set("fs.s3n.sse.kms.keyId", KMS_ID)
sc.hadoopConfiguration.set("fs.s3n.serverSideEncryptionKey", KMS_ID)

val rdd = sc.parallelize(Seq("one", "two", "three", "four"))
rdd.saveAsTextFile(s"s3n://$bucket/$objKey")

This code writes the RDD to S3, but without encryption (I checked the properties of the written object, and "server-side encrypted" shows as "no"). Am I missing anything here, or using any property incorrectly?

Any suggestions would be appreciated.

P.S. I have set the same properties under different names because I am not sure which name to use when, e.g.

sc.hadoopConfiguration.setBoolean("fs.s3n.sse.enabled", true)
sc.hadoopConfiguration.set("fs.s3n.enableServerSideEncryption", "true")
sc.hadoopConfiguration.setBoolean("fs.s3n.enableServerSideEncryption", true)

Thank you.

Upvotes: 3

Views: 8637

Answers (1)

stevel

Reputation: 13490

  1. Stop using s3n and switch to s3a. I don't remember what s3n does with encryption, but you should switch for the performance and scale benefits alone.
  2. Start with SSE-S3 rather than SSE-KMS, as it's easier to set up.
  3. Turn on encryption in the client via the relevant s3a properties (see below).
  4. Add a bucket policy that mandates encryption. That ensures all clients are always set up correctly (a policy sketch follows the configuration example).

Example client configuration (e.g. in core-site.xml), enabling SSE-S3:

<property>
  <name>fs.s3a.server-side-encryption-algorithm</name>
  <value>AES256</value>
</property>
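
For step 4, a sketch of a bucket policy that rejects unencrypted uploads (assuming SSE-S3, so the expected encryption header value is AES256; use "aws:kms" for SSE-KMS; the bucket name is a placeholder):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyUnencryptedPuts",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::my-bucket/*",
      "Condition": {
        "StringNotEquals": {
          "s3:x-amz-server-side-encryption": "AES256"
        }
      }
    }
  ]
}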

See Working with Encrypted Amazon S3 Data; as of October 2019, these are the best docs on encrypting S3 data with s3a, Hadoop, Spark, and Hive.
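
Applied to the question's code, a minimal sketch of the s3a equivalent (assuming Hadoop 2.9+/3.x property names and that the hadoop-aws/s3a connector is on the classpath; AWS_ACCESS_KEY, AWS_SECRET_KEY, and KMS_ID are the question's placeholders):

import org.apache.spark.{SparkConf, SparkContext}

val sparkConf = new SparkConf().
  setMaster("local[*]").
  setAppName("aws-encryption")
val sc = new SparkContext(sparkConf)

// s3a credentials (in production, prefer instance profiles or credential providers)
sc.hadoopConfiguration.set("fs.s3a.access.key", AWS_ACCESS_KEY)
sc.hadoopConfiguration.set("fs.s3a.secret.key", AWS_SECRET_KEY)

// SSE-S3: a single property, no key management required
sc.hadoopConfiguration.set("fs.s3a.server-side-encryption-algorithm", "AES256")

// SSE-KMS variant, instead of the line above:
// sc.hadoopConfiguration.set("fs.s3a.server-side-encryption-algorithm", "SSE-KMS")
// sc.hadoopConfiguration.set("fs.s3a.server-side-encryption.key", KMS_ID)

val rdd = sc.parallelize(Seq("one", "two", "three", "four"))
rdd.saveAsTextFile(s"s3a://$bucket/$objKey")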

AWS EMR readers: None of this applies to you. Switch to Apache Hadoop or look up the EMR docs.

Upvotes: 2
