ronald mcdolittle

Reputation: 567

Remove all files in s3 bucket but not all directories

So I wrote a nifty script to delete everything in an S3 bucket, which I've shared below. The only issue is that this script deletes directories too. I have a data ingestion bucket that needs to keep its directories but have its files removed every day. Does anyone have ideas on how I can modify it to not delete directories?

I've seen CLI commands that can do something similar to this, but I'd need to figure out how to run those from a Lambda function.

import json
import boto3

def empty_s3_bucket():
  client = boto3.client('s3')
  bucketNames = ["bucket1", "bucket2"]
  for bucketName in bucketNames:
    response = client.list_objects_v2(Bucket=bucketName)
    while 'Contents' in response:
      for item in response['Contents']:
        print('deleting file', item['Key'])
        client.delete_object(Bucket=bucketName, Key=item['Key'])
      # list_objects_v2 returns at most 1000 keys per call,
      # so keep fetching pages until the listing is complete
      if not response['IsTruncated']:
        break
      response = client.list_objects_v2(
        Bucket=bucketName,
        ContinuationToken=response['NextContinuationToken'],
      )

def lambda_handler(event, context):
  empty_s3_bucket()
  return {
    'statusCode': 200,
    'body': json.dumps('Buckets cleared!')
  }

Upvotes: 0

Views: 1616

Answers (2)

Andriy Ivaneyko

Reputation: 22021

There are no such concepts as directories or files in S3. Everything is an object, so there is no reliable way to tell them apart. You could restructure what is stored in the bucket, so that one bucket stores the objects that are actually folders and another stores the files in its root. Or you can place the files in a specific "folder" and delete only the objects inside it.
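The last option can be sketched roughly as follows, assuming all the files to clear live under one prefix. The function name `delete_under_prefix` and the `incoming/` prefix in the usage note are placeholders for illustration, not anything from the question, and `client` is expected to be a boto3 S3 client.

```python
def delete_under_prefix(client, bucket, prefix):
    """Delete every object whose key starts with `prefix`, except an
    object whose key is exactly `prefix` (the "folder" placeholder, if any).

    Returns the list of deleted keys.
    """
    deleted = []
    kwargs = {'Bucket': bucket, 'Prefix': prefix}
    while True:
        response = client.list_objects_v2(**kwargs)
        for item in response.get('Contents', []):
            if item['Key'] == prefix:
                # Keep the placeholder object for the "folder" itself
                continue
            client.delete_object(Bucket=bucket, Key=item['Key'])
            deleted.append(item['Key'])
        if not response.get('IsTruncated'):
            break
        # More than one page of results: continue where the last call stopped
        kwargs['ContinuationToken'] = response['NextContinuationToken']
    return deleted
```

It could be called as `delete_under_prefix(boto3.client('s3'), 'bucket1', 'incoming/')`. Note that this sketch still removes placeholder objects for any nested "folders" under the prefix.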

However, in most cases the key of a "directory" ends with a delimiter: / for folders created in the S3 console and by most tools, or occasionally \. So you can add an if check before deleting:

if item['Key'].endswith('\\') or item['Key'].endswith('/'):
    # Skip directory placeholders
    continue
else:
    print('deleting file', item['Key'])
    .....
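To put that check in context, here is a minimal sketch of the whole loop, assuming `client` is a boto3 S3 client. The names `is_directory_marker` and `delete_files_keep_directories` are made up for illustration. The paginator and the batched `delete_objects` call are a swap for the one-by-one `delete_object` calls in the question, not something the question's script does.

```python
def is_directory_marker(key):
    # Assumption: "directories" are placeholder objects whose key
    # ends with a delimiter (/ or \).
    return key.endswith(('/', '\\'))


def delete_files_keep_directories(client, bucket):
    """Delete every object in `bucket` except directory markers.

    Returns the list of keys that were sent for deletion.
    """
    deleted = []
    paginator = client.get_paginator('list_objects_v2')
    for page in paginator.paginate(Bucket=bucket):
        keys = [
            item['Key']
            for item in page.get('Contents', [])
            if not is_directory_marker(item['Key'])
        ]
        if not keys:
            continue
        # A page holds at most 1000 keys, which is also the
        # per-request limit of delete_objects.
        client.delete_objects(
            Bucket=bucket,
            Delete={'Objects': [{'Key': key} for key in keys]},
        )
        deleted.extend(keys)
    return deleted
```

Inside the Lambda handler this would be called once per bucket, for example `delete_files_keep_directories(boto3.client('s3'), 'bucket1')`.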

Upvotes: 3

Daniel Roseman

Reputation: 599460

Directories don't really exist in S3. A key like "foo/bar/baz" is just that; "foo" and "bar" don't exist separately. So if you need to preserve "directories", you need to keep some objects that have those directories as part of their keys.
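A small illustration of that point, with no S3 calls involved: the "directories" a listing shows are just the prefixes of whatever keys currently exist. The helper name `implied_directories` is hypothetical.

```python
def implied_directories(keys):
    """Return the set of "directory" prefixes implied by a list of keys."""
    prefixes = set()
    for key in keys:
        parts = key.split('/')[:-1]  # drop the final path component
        for i in range(1, len(parts) + 1):
            prefixes.add('/'.join(parts[:i]) + '/')
    return prefixes


print(sorted(implied_directories(['foo/bar/baz', 'foo/qux'])))
# ['foo/', 'foo/bar/']

# Once "foo/bar/baz" is deleted, nothing implies "foo/bar/" any more
print(sorted(implied_directories(['foo/qux'])))
# ['foo/']

# A placeholder object with the key "foo/bar/" keeps the prefix alive
print(sorted(implied_directories(['foo/qux', 'foo/bar/'])))
# ['foo/', 'foo/bar/']
```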

Upvotes: 1
