Reputation: 1479
I have an EMR cluster whose steps write and delete objects in an S3 bucket. I have been trying to create a bucket policy that denies delete access to all principals except the EMR role and the instance profile. Below is my policy.
{
"Version": "2008-10-17",
"Id": "ExamplePolicyId123458",
"Statement": [
{
"Sid": "ExampleStmtSid12345678",
"Effect": "Deny",
"Principal": "*",
"Action": [
"s3:DeleteBucket",
"s3:DeleteObject*"
],
"Resource": [
"arn:aws:s3:::bucket-name",
"arn:aws:s3:::bucket-name/*"
],
"Condition": {
"StringNotLike": {
"aws:userId": [
"AROAI3FK4OGNWXLHB7IXM:*", #EMR Role Id
"AROAISVF3UYNPH33RYIZ6:*", # Instance Profile Role ID
"AIPAIDBGE7J475ON6BAEU" # Instance Profile ID
]
}
}
}
]
}
From what I have read, wildcards cannot be used in the "NotPrincipal" element to match every role session, so I used a condition on aws:userId instead.
When I run the EMR step without the bucket policy, it completes successfully. But when I add the policy to the bucket and rerun it, the step fails with the following error:
diagnostics: User class threw exception:
org.apache.hadoop.fs.s3a.AWSS3IOException: delete on s3://vr-dump/metadata/test:
com.amazonaws.services.s3.model.MultiObjectDeleteException: One or more objects could not be deleted
(Service: null; Status Code: 200; Error Code: null; Request ID: 9FC4797479021CEE; S3 Extended Request ID: QWit1wER1s70BJb90H/0zLu4yW5oI5M4Je5aK8STjCYkkhZNVWDAyUlS4uHW5uXYIdWo27nHTak=), S3 Extended Request ID: QWit1wER1s70BJb90H/0zLu4yW5oI5M4Je5aK8STjCYkkhZNVWDAyUlS4uHW5uXYIdWo27nHTak=: One or more objects could not be deleted (Service: null; Status Code: 200; Error Code: null; Request ID: 9FC4797479021CEE; S3 Extended Request ID: QWit1wER1s70BJb90H/0zLu4yW5oI5M4Je5aK8STjCYkkhZNVWDAyUlS4uHW5uXYIdWo27nHTak=)
What is the problem here? Is this related to EMR Spark Configuration or the bucket policy?
Upvotes: 0
Views: 1170
Reputation: 1086
Assuming these role IDs are correct (they start with AROA, so the format is valid), I believe you also need to include the AWS account number in the policy. For example:
{
"Version": "2008-10-17",
"Id": "ExamplePolicyId123458",
"Statement": [
{
"Sid": "ExampleStmtSid12345678",
"Effect": "Deny",
"Principal": "*",
"Action": [
"s3:DeleteBucket",
"s3:DeleteObject*"
],
"Resource": [
"arn:aws:s3:::vr-dump",
"arn:aws:s3:::vr-dump/*"
],
"Condition": {
"StringNotLike": {
"aws:userId": [
"AROAI3FK4OGNWXLHB7IXM:*", #EMR Role Id
"AROAISVF3UYNPH33RYIZ6:*", # Instance Profile Role ID
"AIPAIDBGE7J475ON6BAEU", # Instance Profile ID
"1234567890" # Your AWS Account Number
]
}
}
}
]
}
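Note that JSON does not allow comments, so the # annotations above have to be removed before the policy is saved. As a rough sketch, the same policy can be generated as valid JSON from Python, together with a small local check of which aws:userId values the Deny would catch. The function names, the session name, and the "other role" ID below are hypothetical, and fnmatchcase is only an approximation of how StringNotLike treats the * wildcard, not a full IAM policy evaluation.

```python
import json
from fnmatch import fnmatchcase


def build_delete_deny_policy(bucket, allowed_user_ids):
    """Build a bucket policy that denies deletes unless aws:userId matches an allowed pattern."""
    return {
        "Version": "2008-10-17",
        "Id": "ExamplePolicyId123458",
        "Statement": [
            {
                "Sid": "ExampleStmtSid12345678",
                "Effect": "Deny",
                "Principal": "*",
                "Action": ["s3:DeleteBucket", "s3:DeleteObject*"],
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {
                    "StringNotLike": {"aws:userId": list(allowed_user_ids)}
                },
            }
        ],
    }


def is_denied(policy, user_id):
    """Approximate StringNotLike: the Deny applies only if user_id matches none of the patterns."""
    patterns = policy["Statement"][0]["Condition"]["StringNotLike"]["aws:userId"]
    return not any(fnmatchcase(user_id, pattern) for pattern in patterns)


policy = build_delete_deny_policy(
    "vr-dump",
    [
        "AROAI3FK4OGNWXLHB7IXM:*",  # EMR role ID
        "AROAISVF3UYNPH33RYIZ6:*",  # instance profile role ID
        "AIPAIDBGE7J475ON6BAEU",    # instance profile ID
        "1234567890",               # placeholder AWS account number
    ],
)

# Comment-free JSON that can be pasted into the bucket policy editor.
policy_json = json.dumps(policy, indent=2)
print(policy_json)

# Hypothetical session name on the EMR role: matches a pattern, so not denied.
print(is_denied(policy, "AROAI3FK4OGNWXLHB7IXM:some-session"))
# Hypothetical other role: matches nothing, so the Deny applies.
print(is_denied(policy, "AROAEXAMPLEOTHERROLE:some-session"))
```

If the role used by the failing step is reported as denied by this check, its role ID is probably missing from the list or mistyped.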
Upvotes: 1