Reputation: 183
Can Python delete specific multiple files in S3?
I want to delete multiple files with specific extensions.
My current script deletes every file under the prefix instead.
These are the kinds of files I want to delete:
XXX.tar.gz
XXX.txt
**Current code** (deletes all files):
import boto3
accesskey = "123"
secretkey = "123"
region = "ap-northeast-1"
s3 = boto3.resource('s3', aws_access_key_id=accesskey, aws_secret_access_key=secretkey, region_name=region)
bucket = s3.Bucket('test')
files = [os.key for os in bucket.objects.filter(Prefix="myfolder/test/")]
tar_files = [file for file in files if file.endswith('tar.gz')]
#print(f'All files: {files}')
#print(f'CSV files: {csv_files}')
objects_to_delete = s3.meta.client.list_objects(Bucket="test", Prefix="myfolder/test/")
delete_keys = {'Objects': []}
delete_keys['Objects'] = [{'Key': tar_files} for tar_files in [obj['Key'] for obj in objects_to_delete.get('Contents', [])]]
s3.meta.client.delete_objects(Bucket="test", Delete=delete_keys)
If anyone knows, please let me know.
Upvotes: 6
Views: 13741
Reputation: 928
My code:
import boto3
# NOTE: you may need to supply additional parameters: region, access key ID, and secret access key
s3_client = boto3.client("s3")
s3_resource = boto3.resource("s3")
bucket_name = "XYZ-data-storage-ABC"
bucket = s3_resource.Bucket(bucket_name)
files = [item.key for item in bucket.objects.all()]
# NOTE: this is where the files to delete are selected by a condition
files_to_delete = [file for file in files if file.endswith("_comp.png")]
print(f"All files: {len(files)}")
total_count = len(files_to_delete)
print(f"files_to_delete: {total_count}")
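# Show the first match as a sanity check (skipped when nothing matched)
if files_to_delete:
    print(f"{files_to_delete[0]=}")
# Delete the matching files one at a time, reporting progress
for counter, file_to_delete in enumerate(files_to_delete, start=1):
    print(f"Deleting file {file_to_delete} - {counter} of {total_count}...")
    s3_client.delete_object(Bucket=bucket_name, Key=file_to_delete)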
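Since the question asks about more than one extension, the selection step can be sketched as a small helper. It relies on `str.endswith` accepting a tuple of suffixes. The function name `select_keys` and the sample keys are made up for illustration and are not part of boto3.

```python
def select_keys(keys, suffixes=(".tar.gz", ".txt")):
    """Return the keys that end with any of the given suffixes."""
    suffixes = tuple(suffixes)  # str.endswith needs a tuple, not a list
    return [key for key in keys if key.endswith(suffixes)]


# Hypothetical keys, standing in for [item.key for item in bucket.objects.all()]
keys = ["myfolder/a.tar.gz", "myfolder/b.txt", "myfolder/c.png", "myfolder/d.gz"]
print(select_keys(keys))
# prints ['myfolder/a.tar.gz', 'myfolder/b.txt']
```

The result could then be used in place of `files_to_delete` in the loop above.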
Upvotes: 0
Reputation: 1389
import boto3
s3 = boto3.resource('s3')
# Walks every bucket in the account; restrict this loop if only one bucket should be touched
for bucket in s3.meta.client.list_buckets()['Buckets']:
    for count, obj in enumerate(s3.Bucket(bucket['Name']).objects.filter()):
        if obj.key.endswith('.tar.gz') or obj.key.endswith('.txt'):
            print("{}: deleting: {} from: {}".format(count, obj.key, bucket['Name']))
            s3.meta.client.delete_object(Bucket=bucket['Name'], Key=obj.key)
I used this method because, for large buckets, building the complete list first and then deleting it can take some time. This way, you can see the progress as it runs.
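A possible middle ground, sketched below, keeps the progress output but sends one `delete_objects` request per batch instead of one `delete_object` call per key. `delete_objects` accepts at most 1000 keys per request, hence the default batch size. The helper names `batches` and `delete_matching` are hypothetical, and `client` is assumed to be something like `boto3.client("s3")`.

```python
from itertools import islice


def batches(iterable, size=1000):
    """Yield lists of up to `size` items without materialising the whole input."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def delete_matching(client, bucket_name, keys, suffixes=(".tar.gz", ".txt")):
    """Delete matching keys in batches and return how many deletions were requested."""
    # Generator expression: keys are filtered lazily as they are listed
    matching = (key for key in keys if key.endswith(tuple(suffixes)))
    total = 0
    for batch in batches(matching):
        client.delete_objects(
            Bucket=bucket_name,
            Delete={"Objects": [{"Key": key} for key in batch]},
        )
        total += len(batch)
        print(f"Requested deletion of {total} objects so far from {bucket_name}")
    return total
```

With boto3, `keys` could be a generator such as `(obj.key for obj in s3.Bucket(name).objects.filter(Prefix=prefix))`, so the full listing never has to be held in memory.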
Upvotes: 1
Reputation: 269101
Presuming that you want to delete *.tar.gz and *.txt files from the given bucket and prefix, this would work:
import boto3
s3_resource = boto3.resource('s3')
bucket = s3_resource.Bucket('my-bucket')
objects = bucket.objects.filter(Prefix='myfolder/')
# Build the list of keys to delete, keeping only the wanted extensions
objects_to_delete = [{'Key': o.key} for o in objects if o.key.endswith('.tar.gz') or o.key.endswith('.txt')]
# delete_objects accepts at most 1000 keys per request
if objects_to_delete:
    s3_resource.meta.client.delete_objects(Bucket='my-bucket', Delete={'Objects': objects_to_delete})
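One thing that may be worth adding: `delete_objects` can succeed as a request while still failing on individual keys, and it reports those under `Errors` in the response (successful keys appear under `Deleted`). A minimal sketch of checking that response follows. The helper name `summarize_delete_response` is hypothetical, and the sample response is hand-written to match the documented shape rather than taken from a real call.

```python
def summarize_delete_response(response):
    """Split a delete_objects response into deleted keys and per-key failures."""
    deleted = [item["Key"] for item in response.get("Deleted", [])]
    failed = [
        (item.get("Key"), item.get("Code"), item.get("Message"))
        for item in response.get("Errors", [])
    ]
    return deleted, failed


# Hand-written sample in the documented response shape, not real output
sample_response = {
    "Deleted": [{"Key": "myfolder/a.tar.gz"}],
    "Errors": [
        {"Key": "myfolder/b.txt", "Code": "AccessDenied", "Message": "Access Denied"}
    ],
}
deleted, failed = summarize_delete_response(sample_response)
print(f"deleted={deleted}")
print(f"failed={failed}")
```

In the code above, the return value of `delete_objects(...)` would be passed to the helper in place of `sample_response`.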
Upvotes: 12