mattc-7

Reputation: 442

How to trigger AWS Events Rule only when a specific file (key) gets written to an S3 Bucket

I am trying to create an AWS Events Rule (in CloudWatch or EventBridge) that triggers the run of an AWS Step Function when a specific file is put into an S3 Bucket.

My event pattern for the Rule is shown below:

{
  "source": [
    "aws.s3"
  ],
  "detail-type": [
    "AWS API Call via CloudTrail"
  ],
  "detail": {
    "eventSource": [
      "s3.amazonaws.com"
    ],
    "eventName": [
      "PutObject"
    ],
    "requestParameters": {
      "bucketName": [
        "bucketname"
      ],
      "key": [
        "date={{TODAYS DATE}}/_SUCCESS"
      ]
    }
  }
}

Ideally I would like the key element to point to a path where TODAYS DATE represents the current date and _SUCCESS is an empty file written to the directory by my job once it has completed successfully (e.g. if today were 10/31/2019, the full bucket path to check would be bucketname/date=20191031/_SUCCESS). The end goal is to have the Event Rule trigger a Step Function which controls a number of other daily jobs that can only run once the first job, the one that outputs the _SUCCESS file to the bucket, has completed successfully.

Preferably I would like the key check for the _SUCCESS file to use that day's date. However, if there is no good way to deal with the dates, I could also make something work if there is a way to trigger the Rule once when a new directory is put into the bucket (e.g. trigger when the directory date=XXXXXX is created). I just cannot have the trigger activate each time any new file is put into the bucket, as the initial job creates a number of output files in the date=XXXXXX directory which are used as input for the following jobs.

It would also be very helpful to be able to create this Rule via AWS CloudFormation, so if CloudFormation has any way to deal with these issues that would be great.

Thank you in advance for any help, it is greatly appreciated.

Upvotes: 4

Views: 4765

Answers (1)

Matus Dubrava

Reputation: 14462

I am not sure whether I understand what you are trying to achieve here, but why don't you just subscribe a Lambda function to the bucket where your files are being stored (subscribe it to the PUT event), perform whatever checks you want programmatically inside that Lambda function, and, if all the conditions are met, call the mentioned Step Function from within the Lambda function?

And if any of the conditions is not met, simply don't launch the Step Function.

Here is how you can subscribe a Lambda function to the S3 PUT event (via the web console); a scripted alternative is sketched after the list.

  1. go to S3
  2. pick your bucket
  3. go to Properties tab
  4. select Events
  5. check PUT event
  6. under Send to, pick Lambda Function
  7. pick an existing Lambda function (you need to create that Lambda function first)
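
The same subscription can also be created programmatically. Below is a minimal boto3 sketch; the bucket name and Lambda ARN are placeholders, and the suffix filter is worth noting because it makes S3 invoke the function only for keys ending in _SUCCESS, which avoids firing on every file the job writes:

import boto3

s3 = boto3.client('s3')

# Placeholder bucket and Lambda ARN; substitute your own values.
# The Lambda function must separately grant S3 permission to invoke it
# (lambda add-permission / AWS::Lambda::Permission).
s3.put_bucket_notification_configuration(
    Bucket='bucketname',
    NotificationConfiguration={
        'LambdaFunctionConfigurations': [
            {
                'LambdaFunctionArn': 'arn:aws:lambda:us-east-1:123456789012:function:check-success',
                'Events': ['s3:ObjectCreated:Put'],
                # only keys ending in "_SUCCESS" trigger the function
                'Filter': {
                    'Key': {
                        'FilterRules': [
                            {'Name': 'suffix', 'Value': '_SUCCESS'}
                        ]
                    }
                }
            }
        ]
    }
)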

Here is how to access properties such as the bucket name, object key, and event timestamp from within the Lambda function (using Python).

def handler_name(event, context): 
    # get bucket name
    print(event['Records'][0]['s3']['bucket']['name'])

    # get object key
    print(event['Records'][0]['s3']['object']['key'])

    # get event timestamp
    print(event['Records'][0]['eventTime'])

    return 0
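
For the specific check in the question, the handler can URL-decode the key (S3 URL-encodes object keys in event notifications) and compare it against the expected path for the current date. This is a minimal sketch; the date=YYYYMMDD/_SUCCESS layout comes from the question, and the use of UTC is an assumption:

from datetime import datetime, timezone
from urllib.parse import unquote_plus

def handler_name(event, context):
    # S3 URL-encodes object keys in event notifications, so decode first
    key = unquote_plus(event['Records'][0]['s3']['object']['key'])

    # expected key for today, e.g. "date=20191031/_SUCCESS" (UTC assumed)
    expected = 'date={}/_SUCCESS'.format(
        datetime.now(timezone.utc).strftime('%Y%m%d'))

    if key == expected:
        # all conditions met: start the Step Function here
        pass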

Here is the complete event object (the S3 event object, that is) for reference.

{
  "Records": [
    {
      "eventVersion": "2.1",
      "eventSource": "aws:s3",
      "awsRegion": "us-east-2",
      "eventTime": "2019-09-03T19:37:27.192Z",
      "eventName": "ObjectCreated:Put",
      "userIdentity": {
        "principalId": "AWS:AIDAINPONIXQXHT3IKHL2"
      },
      "requestParameters": {
        "sourceIPAddress": "205.255.255.255"
      },
      "responseElements": {
        "x-amz-request-id": "D82B88E5F771F645",
        "x-amz-id-2": "vlR7PnpV2Ce81l0PRw6jlUpck7Jo5ZsQjryTjKlc5aLWGVHPZLj5NeC6qMa0emYBDXOo6QBU0Wo="
      },
      "s3": {
        "s3SchemaVersion": "1.0",
        "configurationId": "828aa6fc-f7b5-4305-8584-487c791949c1",
        "bucket": {
          "name": "lambda-artifacts-deafc19498e3f2df",
          "ownerIdentity": {
            "principalId": "A3I5XTEXAMAI3E"
          },
          "arn": "arn:aws:s3:::lambda-artifacts-deafc19498e3f2df"
        },
        "object": {
          "key": "b21b84d653bb07b05b1e6b33684dc11b",
          "size": 1305107,
          "eTag": "b21b84d653bb07b05b1e6b33684dc11b",
          "sequencer": "0C0F6F405D6ED209E1"
        }
      }
    }
  ]
}

Here is how to execute a Step Function from within a Lambda function (using Python + Boto3).

import boto3

sfn_client = boto3.client('stepfunctions')

def handler_name(event, context): 

    response = sfn_client.start_execution(
        stateMachineArn='string',
        name='string',
        input='string'
    )

    return 0

where stateMachineArn is the Amazon Resource Name (ARN) of the state machine to execute, name (optional) is the name of the execution, and input is a string containing the JSON input data for the execution.
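
For example, a minimal sketch with placeholder values (the ARN is made up, and json.dumps keeps the input valid JSON):

import json

response = sfn_client.start_execution(
    # placeholder ARN; substitute your state machine's ARN
    stateMachineArn='arn:aws:states:us-east-1:123456789012:stateMachine:daily-jobs',
    # optional; execution names must be unique within the account for 90 days
    name='success-20191031',
    input=json.dumps({'bucket': 'bucketname', 'key': 'date=20191031/_SUCCESS'})
)
print(response['executionArn'])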

Upvotes: 4
