TheDataEngineer

Reputation: 41

How to filter S3 "new object created" events by file name using Python/Boto3?

I have a bucket called bucket_1.

On bucket_1 I have S3 events set up for all new objects created within that bucket. What I would like to do is set up a rule so that if the file that is dropped is prefixed with "data", it triggers a Lambda function that will process the data within the file.

What I'm struggling with is how to filter for a particular file. So far the Python code I have is this:

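import json

import boto3


def handler(event, context):
    s3 = boto3.resource('s3')  # not used yet; needed later to fetch the file
    # The S3 notification arrives as a JSON string inside the SNS message
    message = json.loads(event['Records'][0]['Sns']['Message'])
    print("JSON: " + json.dumps(message))

    return message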

This Lambda is triggered when a message is published to my SNS topic, but I only want to act on object-creation events for files with a prefix of "data".

Has anyone done something similar before? For clarity, this is the workflow I would like to happen:

1. A file is added to bucket_1.
2. A notification is sent to the SNS topic [triggers the Python code].
3. Python filters for object-created notifications where the file has a prefix of "data" [triggers the Python in the next step].
4. Python fetches the data from the S3 location, cleans it up, and places it into a table.

So specifically, I am looking for how exactly to set up step 3.
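
To make step 3 concrete, here is a rough sketch of what filtering inside the handler might look like. It assumes the SNS message body is a standard S3 event notification (a JSON document with a "Records" list, where each record carries "eventName" and "s3.object.key"). The helper name matching_objects and the KEY_PREFIX constant are made up for illustration.

```python
import json
from urllib.parse import unquote_plus

KEY_PREFIX = "data"  # assumption: the prefix described in the question


def matching_objects(event, prefix=KEY_PREFIX):
    """Return (bucket, key) pairs for created objects whose key starts with prefix."""
    matches = []
    for sns_record in event.get("Records", []):
        # Each SNS record wraps the S3 notification as a JSON string
        s3_event = json.loads(sns_record["Sns"]["Message"])
        # Test messages from S3 have no "Records", hence the default
        for record in s3_event.get("Records", []):
            if not record.get("eventName", "").startswith("ObjectCreated:"):
                continue
            # Object keys are URL-encoded in S3 notifications
            key = unquote_plus(record["s3"]["object"]["key"])
            if key.startswith(prefix):
                matches.append((record["s3"]["bucket"]["name"], key))
    return matches


def handler(event, context):
    objects = matching_objects(event)
    for bucket, key in objects:
        # Placeholder for step 4 (fetch, clean, load into a table)
        print(f"Would process s3://{bucket}/{key}")
    return objects
```

With this approach the Lambda still runs for every notification and simply ignores the ones that do not match.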

Upvotes: 1

Views: 1828

Answers (1)

TheDataEngineer

Reputation: 41

After looking around, I've found that this isn't actually done in Python, but rather when configuring event notifications on the S3 bucket itself.

Go to: S3 > choose your bucket > Properties > Event notifications > Create event notification

Once there, select "All object create events"; you can also specify a prefix and suffix for the file name. The topic will then only receive notifications for the objects that match what you've specified.
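
If you would rather script this than click through the console, the same prefix filter can be set with Boto3's put_bucket_notification_configuration. This is only a sketch: the helper names are made up, the topic ARN is whatever your own topic uses, and the prefix defaults to the "data" from the question.

```python
def build_notification_config(topic_arn, prefix="data"):
    """Build a notification config that publishes only matching object-created events."""
    return {
        "TopicConfigurations": [
            {
                "TopicArn": topic_arn,
                "Events": ["s3:ObjectCreated:*"],
                # Only keys starting with `prefix` generate a notification
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
                },
            }
        ]
    }


def apply_notification_config(bucket, topic_arn, prefix="data"):
    """Apply the config to the bucket (needs AWS credentials and boto3)."""
    import boto3  # imported here so the builder above works without boto3

    s3 = boto3.client("s3")
    # Note: this call replaces the bucket's entire notification configuration
    s3.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=build_notification_config(topic_arn, prefix),
    )
```

You would call something like apply_notification_config("bucket_1", "<your topic ARN>"). The SNS topic's access policy also has to allow S3 to publish to it, otherwise the call is rejected.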

Hopefully this helps someone at some point!

Upvotes: 3
