Fanny Zhingre

Reputation: 183

How to delete multiple files matching a specific pattern in S3 with boto3

Can Python delete multiple specific files in S3?

I want to delete multiple files with specific extensions.

My current script removes all files instead.

These are the specific kinds of files I want to delete:

XXX.tar.gz
XXX.txt

**Current code:** (this deletes all files)

import boto3

accesskey = "123"
secretkey = "123"
region = "ap-northeast-1"

s3 = boto3.resource('s3', aws_access_key_id=accesskey, aws_secret_access_key=secretkey, region_name=region)

bucket = s3.Bucket('test')
files = [obj.key for obj in bucket.objects.filter(Prefix="myfolder/test/")]
tar_files = [file for file in files if file.endswith('tar.gz')]

#print(f'All files: {files}')
#print(f'tar files: {tar_files}')

objects_to_delete = s3.meta.client.list_objects(Bucket="test", Prefix="myfolder/test/")

delete_keys = {'Objects': []}
delete_keys['Objects'] = [{'Key': key} for key in [obj['Key'] for obj in objects_to_delete.get('Contents', [])]]

s3.meta.client.delete_objects(Bucket="test", Delete=delete_keys)

If anyone knows, please let me know.

Upvotes: 6

Views: 13741

Answers (3)

Serhii Kushchenko

Reputation: 928

My code:

import boto3

# NOTE: you may need to supply additional parameters: region_name, aws_access_key_id, and aws_secret_access_key
s3_client = boto3.client("s3")
s3_resource = boto3.resource("s3")
bucket_name = "XYZ-data-storage-ABC"

bucket = s3_resource.Bucket(bucket_name)
files = [item.key for item in bucket.objects.all()]

# NOTE: filter by your own condition here to select the files to delete
files_to_delete = [file for file in files if file.endswith("_comp.png")]

print(f"All files: {len(files)}")
total_count = len(files_to_delete)
print(f"files_to_delete: {total_count}")
print(f"{files_to_delete[0]=}")

counter = 0
for file_to_delete in files_to_delete:
    counter = counter + 1
    print(f"Deleting file {file_to_delete} - {counter} of {total_count}...")
    s3_client.delete_object(Bucket=bucket_name, Key=file_to_delete)
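
This calls delete_object once per key. As a minimal sketch, assuming the same s3_client, bucket_name, files_to_delete, and total_count as above, the keys could instead be sent to delete_objects in batches, which cuts the number of API requests:

# Minimal sketch, assuming s3_client, bucket_name, files_to_delete, and total_count from above.
# delete_objects accepts at most 1,000 keys per request, so send the keys in batches.
for start in range(0, len(files_to_delete), 1000):
    batch = files_to_delete[start:start + 1000]
    s3_client.delete_objects(
        Bucket=bucket_name,
        Delete={"Objects": [{"Key": key} for key in batch]},
    )
    print(f"Deleted {min(start + 1000, len(files_to_delete))} of {total_count} files")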

Upvotes: 0

Alon Lavian

Reputation: 1389

  1. Iterate over your S3 buckets
  2. For each bucket, iterate over the files
  3. Delete the requested file types

import boto3

s3 = boto3.resource('s3')

for bucket in s3.meta.client.list_buckets()['Buckets']:
    for count, obj in enumerate(s3.Bucket(bucket['Name']).objects.filter()):
        if obj.key.endswith('.tar.gz') or obj.key.endswith('.txt'):
            print("{}: deleting: {} from: {}".format(count, obj.key, bucket['Name']))
            s3.meta.client.delete_object(Bucket=bucket['Name'], Key=obj.key)

I used this method because, for large buckets, creating the complete list and then deleting it can take some time. This way, you see the progress.
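
If you only need one bucket and prefix rather than every bucket in the account, a minimal sketch of the same streaming approach, assuming a hypothetical bucket name and prefix, would be:

import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('my-bucket')  # hypothetical bucket name

# Stream matching objects under a hypothetical prefix and delete them one by one
for count, obj in enumerate(bucket.objects.filter(Prefix='myfolder/test/')):
    if obj.key.endswith('.tar.gz') or obj.key.endswith('.txt'):
        print("{}: deleting: {}".format(count, obj.key))
        obj.delete()  # ObjectSummary.delete() removes this single object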

Upvotes: 1

John Rotenstein

Reputation: 269101

Presuming that you want to delete *.tar.gz and *.txt files from the given bucket and prefix, this would work:

import boto3

s3_resource = boto3.resource('s3')

bucket = s3_resource.Bucket('my-bucket')
objects = bucket.objects.filter(Prefix='myfolder/')

objects_to_delete = [{'Key': o.key} for o in objects if o.key.endswith('.tar.gz') or o.key.endswith('.txt')]

if len(objects_to_delete):
    s3_resource.meta.client.delete_objects(Bucket='my-bucket', Delete={'Objects': objects_to_delete})
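
Note that a single delete_objects request accepts at most 1,000 keys, so if the prefix contains more matching objects than that, split objects_to_delete into chunks of 1,000 and call delete_objects once per chunk.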

Upvotes: 12
