Reputation: 1019
I have a function which changes the storage class of an S3 object. The function works, except that tags are not being copied:
import boto3

BUCKET = 'my_bucket'  # bucket holding the objects

def to_deep_archive(s3_key):
    '''
    Set the storage class to DEEP_ARCHIVE.
    Copied from https://stackoverflow.com/questions/39309846/how-to-change-storage-class-of-existing-key-via-boto3
    '''
    s3 = boto3.client('s3')
    # Source object to move to DEEP_ARCHIVE
    copy_source = {
        'Bucket': BUCKET,
        'Key': s3_key
    }
    # TODO : encryption
    # Convert to DEEP_ARCHIVE by copying the object onto itself
    s3.copy(
        copy_source,
        BUCKET,
        s3_key,
        ExtraArgs={
            'StorageClass': 'DEEP_ARCHIVE',
            'MetadataDirective': 'COPY',
            'TaggingDirective': 'COPY',
            'ServerSideEncryption': 'AES256'
        }
    )
There was no exception thrown. My role policy looks something like this:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:DeleteObjectTagging",
                "s3:GetObject",
                "s3:GetObjectTagging",
                "s3:PutObjectTagging",
                "s3:ReplicateTags"
            ],
            "Resource": "arn:aws:s3:::my_bucket/*"
        }
    ]
}
My bucket policy looks like this:
{
    "Sid": "Stmt1492757001621",
    "Effect": "Allow",
    "Principal": {
        "AWS": "arn:aws:iam::my_account:role/my_role"
    },
    "Action": [
        "s3:GetObject",
        "s3:GetObjectTagging",
        "s3:PutObjectTagging",
        "s3:DeleteObjectTagging",
        "s3:ListBucket",
        "s3:ReplicateTags"
    ],
    "Resource": [
        "arn:aws:s3:::my_bucket/*",
        "arn:aws:s3:::my_bucket"
    ]
}
Is there something else I need to do?
Upvotes: 3
Views: 3870
Reputation: 101
I've found an interesting discrepancy in how s3.copy() handles the Tagging and TaggingDirective extra arguments.
As per the source code of s3transfer/copies.py, which appears to implement s3.copy(), the underlying behaviour depends on the object's size. If it exceeds a certain multipart_threshold, the object is copied using s3_client.upload_part_copy(). If it's below the threshold, it's copied using the ordinary s3_client.copy_object(), which has a 5 GB size limit. From the copy_object docs:
You create a copy of your object up to 5 GB in size in a single atomic action using this API. However, to copy an object greater than 5 GB, you must use the multipart upload Upload Part - Copy (UploadPartCopy) API.
Unfortunately, the Tagging and TaggingDirective arguments are supported by copy_object() but not by upload_part_copy(). See the latter's documentation here. Accordingly, TaggingDirective is explicitly blacklisted as an argument to exclude when submitting the upload_part_copy() request, but no such exclusion is applied for copy_object(), where the argument, along with Tagging, is passed through.
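For what it's worth, s3.copy() also accepts a Config argument, a boto3.s3.transfer.TransferConfig, whose multipart_threshold (8 MB by default) is the cut-off described above. The sketch below, with placeholder bucket and key names, raises that threshold so the copy stays on the single-request copy_object() path, where TaggingDirective is honoured; this can only apply to objects within copy_object()'s 5 GB limit.
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client('s3')

# Raise the multipart threshold so the copy goes through copy_object()
# (a single request) rather than upload_part_copy(). Placeholder names;
# only viable for objects up to the 5 GB copy_object() limit.
single_part = TransferConfig(multipart_threshold=5 * 1024 ** 3)

s3.copy(
    {'Bucket': 'my_bucket', 'Key': 'some/key'},
    'my_bucket',
    'some/key',
    ExtraArgs={
        'StorageClass': 'DEEP_ARCHIVE',
        'TaggingDirective': 'COPY'
    },
    Config=single_part
)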
In summary, both tagging ExtraArgs seem as though they should work for small files, but not for large ones. Therefore, I'm falling back to a subsequent put_object_tagging() call after the copy, which is unfortunate because of the extra API call and the window between copy and tagging.
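A minimal sketch of that fallback, with placeholder bucket and key names, reading the tags before the copy and re-applying them afterwards:
import boto3

s3 = boto3.client('s3')
bucket = 'my_bucket'   # placeholder
key = 'some/key'       # placeholder

# Read the existing tags first, since a multipart copy will drop them.
tags = s3.get_object_tagging(Bucket=bucket, Key=key)['TagSet']

# Copy the object onto itself to change its storage class.
s3.copy(
    {'Bucket': bucket, 'Key': key},
    bucket,
    key,
    ExtraArgs={'StorageClass': 'DEEP_ARCHIVE', 'MetadataDirective': 'COPY'}
)

# Re-apply the tags; there is a short window after the copy during which
# the object carries no tags.
s3.put_object_tagging(Bucket=bucket, Key=key, Tagging={'TagSet': tags})
Reading the tags before the copy matters when copying an object onto itself, since the copy replaces the object (and its tags) under the same key.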
Upvotes: 4
Reputation: 1
You can use copy_object with TaggingDirective='COPY' to copy S3 objects together with their tags.
import boto3

s3 = boto3.client('s3')

# "object" is assumed to be an entry from a bucket listing (e.g. list_objects_v2)
response = s3.copy_object(
    Bucket='destination bucket',
    CopySource={'Bucket': 'source bucket',
                'Key': object["Key"]},
    Key=object["Key"],
    TaggingDirective='COPY'
)
Upvotes: 0