Reputation: 451
There is a list of tables that are copied from Aurora to an S3 bucket in csv format.
For every S3 PUT event a lambda function is triggered, which processes the corresponding csv file.
If I have 50 csv files, how could I track that all of them have been successfully processed?
The conceptual solution would be to have a list of the 50 csv files, each associated with a lambda execution id; when each function finishes, it updates the corresponding entry in that list.
When all files are processed correctly, a trigger is fired and sends an SNS message.
But I don't really know which tools to use or what would be the best way to implement a solution like this.
Thanks for your time.
Upvotes: 4
Views: 1453
Reputation: 1
You can create a CloudWatch alarm for your Lambda function that sends you an alert on Slack/email/SMS via an SNS topic.
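For example, a minimal sketch with boto3 that alarms on the Lambda Errors metric (the alarm name, function name, and SNS topic ARN are placeholders):

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Alarm whenever the CSV-processing Lambda reports any errors in a 5-minute window.
# "csv-processor" and the topic ARN are placeholder names.
cloudwatch.put_metric_alarm(
    AlarmName="csv-processor-errors",
    Namespace="AWS/Lambda",
    MetricName="Errors",
    Dimensions=[{"Name": "FunctionName", "Value": "csv-processor"}],
    Statistic="Sum",
    Period=300,
    EvaluationPeriods=1,
    Threshold=0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:csv-alerts"],
)
```

Note that this alerts you on failures; it does not by itself confirm that all 50 files completed.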
Upvotes: 0
Reputation: 238071
There are a few ways of doing this. One way would involve a second lambda (called L2) and an SQS queue.
In this solution, your first lambda (L1) would be triggered by the S3 events and process the csv files. Upon completion of the csv processing, L1 would also publish a message to the SQS queue with the metadata of the processed file.
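A rough sketch of what L1's handler could look like (the queue URL and the actual csv processing are assumptions, not your existing code):

```python
import json
import urllib.parse

import boto3

s3 = boto3.client("s3")
sqs = boto3.client("sqs")

QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/csv-done"  # placeholder


def handler(event, context):
    # One S3 PUT event can carry several records.
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

        process_csv(bucket, key)  # your existing csv logic goes here

        # Tell L2 that this file is done.
        sqs.send_message(
            QueueUrl=QUEUE_URL,
            MessageBody=json.dumps({"bucket": bucket, "key": key}),
        )


def process_csv(bucket, key):
    obj = s3.get_object(Bucket=bucket, Key=key)
    # ... parse obj["Body"] and load the rows wherever they need to go ...
```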
The SQS queue would trigger L2. The only job of L2 would be to check if all the files have been processed, and if yes, send an SNS notification to you.
Exact details of "check if all the files have been processed" are application specific and depend on how you mark each csv file as processed. You could store the csv completion metadata in DynamoDB, or in S3 as you may be doing now.
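As an illustration only, assuming the completion metadata goes into a DynamoDB table named csv_progress with partition key "key", and that the expected file count (50) is known up front:

```python
import json

import boto3

dynamodb = boto3.resource("dynamodb")
sns = boto3.client("sns")

TABLE_NAME = "csv_progress"                                    # placeholder
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:csv-complete"  # placeholder
EXPECTED_FILES = 50


def handler(event, context):
    table = dynamodb.Table(TABLE_NAME)

    # Mark each file reported by L1 as processed.
    for record in event["Records"]:
        body = json.loads(record["body"])
        table.put_item(Item={"key": body["key"], "status": "processed"})

    # Check whether every expected file has been marked.
    processed = table.scan(Select="COUNT")["Count"]
    if processed >= EXPECTED_FILES:
        sns.publish(
            TopicArn=TOPIC_ARN,
            Subject="CSV batch complete",
            Message=f"All {EXPECTED_FILES} csv files have been processed.",
        )
```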
To eliminate concurrency issues in L2, you could limit its concurrency to 1, so the SQS messages are processed by only one function instance at a time (not several L2 functions in parallel).
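For example (the function name "L2" is a placeholder):

```python
import boto3

lambda_client = boto3.client("lambda")

# Reserve a concurrency of 1 so only one L2 instance runs at any time.
lambda_client.put_function_concurrency(
    FunctionName="L2",
    ReservedConcurrentExecutions=1,
)
```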
You could extend the above solution with a second SQS queue, a so-called dead-letter queue (DLQ), which would hold information about failed csv processing attempts. This way your L2 would also be able to determine if something has failed, based on the DLQ.
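One way to wire that up, assuming L1 is invoked asynchronously by the S3 events, is to attach a dead-letter queue to L1 so that failed invocations land there (function name and queue ARN are placeholders):

```python
import boto3

lambda_client = boto3.client("lambda")

# Failed async invocations of L1 (i.e. csv files that could not be processed)
# end up in this queue, which L2 or an alarm can then inspect.
lambda_client.update_function_configuration(
    FunctionName="L1",
    DeadLetterConfig={
        "TargetArn": "arn:aws:sqs:us-east-1:123456789012:csv-failed-dlq"
    },
)
```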
Upvotes: 3