Reputation: 4768
I have an event-driven data pipeline on AWS that processes millions of files. Each file in my S3 bucket triggers a Lambda. The Lambda processes the data in the file and writes the processed data to another S3 bucket, which in turn triggers another Lambda, and so on.
Downstream in my pipeline I have a Lambda that creates an Athena database and table. This Lambda is triggered as soon as an object is written under the appropriate key of my S3 bucket. The Lambda that creates the Athena database and table only needs to run once.
How can I prevent this Lambda from being triggered multiple times?
Upvotes: 0
Views: 170
Reputation: 6552
This is your existing flow:
Your step 3 is not event driven; you are forcing an event.
I suggest the following flow:
Only two steps: the Lambda that processes the file should use the Athena SDK to check whether the desired table already exists, and only if it does not should it call the Lambda that creates the Athena table. The delivery S3 bucket should not trigger the Athena Lambda.
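A minimal Python sketch of that check, assuming boto3-style clients are passed in. The names `table_exists`, `ensure_table`, the payload shape, and the creator function name are hypothetical and not from the question. It also assumes that Athena's `get_table_metadata` raises `MetadataException` when the table is missing from the Glue Data Catalog, which is worth verifying in your own account.

```python
import json


def table_exists(athena, database, table, catalog="AwsDataCatalog"):
    """Return True if Athena can find metadata for the table."""
    try:
        athena.get_table_metadata(
            CatalogName=catalog, DatabaseName=database, TableName=table
        )
        return True
    except athena.exceptions.MetadataException:
        # Assumption: a missing table surfaces as MetadataException.
        return False


def ensure_table(athena, lambda_client, database, table, creator_function):
    """Invoke the table-creating Lambda only when the table is missing.

    Returns True if the creator Lambda was invoked, False otherwise.
    """
    if table_exists(athena, database, table):
        return False
    lambda_client.invoke(
        FunctionName=creator_function,  # hypothetical function name
        InvocationType="Event",         # asynchronous, fire and forget
        Payload=json.dumps({"database": database, "table": table}).encode("utf-8"),
    )
    return True
```

In the processing Lambda you would call something like `ensure_table(boto3.client("athena"), boto3.client("lambda"), "my_db", "my_table", "create-athena-table")` after writing the output object. With many files processed concurrently, several invocations may still see the table as missing at the same moment, so the creator Lambda should probably stay idempotent as well, for example by using `CREATE EXTERNAL TABLE IF NOT EXISTS`.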
Upvotes: 1