Reputation: 11
I have built a file-processing pipeline in which a file, once added to an S3 bucket, triggers 5 to 6 different Lambdas. Each Lambda downloads the file and does some processing on it.
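For context, each Lambda handler currently looks roughly like this (a minimal sketch assuming boto3; the per-function processing step is a placeholder):

```python
import os
import boto3

s3 = boto3.client("s3")

def handler(event, context):
    # Each S3 event record names the bucket and object key that triggered us.
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    key = record["s3"]["object"]["key"]

    # Download the object to Lambda's local /tmp storage. This GET is what
    # each of the 5-6 functions repeats for the same file.
    local_path = os.path.join("/tmp", os.path.basename(key))
    s3.download_file(bucket, key, local_path)

    process(local_path)  # placeholder for this function's processing step

def process(path):
    ...
```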
Here is the problem: downloading the file in each Lambda accounts for 50% of the total S3 cost incurred. Is there any way I can store the file in a cache, download it from there into the Lambdas, and delete it from the cache once processing has completed?
Some pointers: the processes must run simultaneously and cannot be combined into a single Lambda.
The Lambdas are in the same region as the S3 bucket. In the previous month alone, we had a total of 250 million GET requests on objects in the bucket.
Upvotes: 0
Views: 51
Reputation: 681
I would suggest a cleaner approach: segregate the functionality and use the Step Functions offering from AWS to orchestrate the flow of execution. You can leverage different states during this orchestration, and if you want to use the output of a child Lambda, the next Lambda can receive it as well.
Please explore this option rather than waiting for a child Lambda's response inside another Lambda.
Official Documentation: https://aws.amazon.com/step-functions/?step-functions.sort-by=item.additionalFields.postDateTime&step-functions.sort-order=desc
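As an illustration, a Parallel state runs all of its branches concurrently, which matches your "must run simultaneously" requirement. Here is a minimal sketch using boto3 (the function ARNs, account ID, region, role, and state machine name are all placeholders):

```python
import json
import boto3

sfn = boto3.client("stepfunctions")

# A Parallel state starts every branch at the same time; each branch here
# is a one-state machine that invokes one of the processing Lambdas.
definition = {
    "StartAt": "FanOut",
    "States": {
        "FanOut": {
            "Type": "Parallel",
            "End": True,
            "Branches": [
                {
                    "StartAt": f"Process{i}",
                    "States": {
                        f"Process{i}": {
                            "Type": "Task",
                            # Placeholder Lambda ARNs for the 6 processors.
                            "Resource": f"arn:aws:lambda:us-east-1:123456789012:function:process-{i}",
                            "End": True,
                        }
                    },
                }
                for i in range(1, 7)
            ],
        }
    },
}

sfn.create_state_machine(
    name="file-processing-pipeline",  # placeholder name
    definition=json.dumps(definition),
    roleArn="arn:aws:iam::123456789012:role/sfn-role",  # placeholder role
)
```

The Parallel state collects each branch's result into a single output array, so a downstream state can consume the child Lambdas' outputs directly.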
Thanks!!
Upvotes: 0