Reputation: 1607
My architecture is AWS API Gateway - Lambda - RDS. I take data from customers as xlsx files.
I have 2 steps:
1. Process the file and store it into transaction tables.
2. After user approval, process the data into the active tables.
My question is about large files: due to the time limits (API Gateway 30 seconds, Lambda 15 minutes) and the synchronous process, I'm unable to complete the processing. Please suggest an AWS service that can run long jobs which need to interact continuously with RDS, to compare the file data against the system data.
Upvotes: 2
Views: 1743
Reputation: 12085
1. Process the file and store it into transaction tables.
My question is about large files
API Gateway has payload limits (10 MB), so it is really not intended for passing larger payloads.
A common practice for large content is to use an API endpoint that returns a presigned URL, which the client can then use to upload the content directly to S3.
Then you could use another API resource, or S3 events, to invoke an action on object creation (start a process, a batch job, ...).
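A minimal sketch of the presigned-URL handler, assuming boto3 behind an API Gateway proxy integration (the bucket name, key scheme, and event shape are illustrative assumptions):

```python
import json
import boto3

s3 = boto3.client("s3")

def handler(event, context):
    # Hypothetical key scheme: file lands under incoming/ in a placeholder bucket.
    filename = event["queryStringParameters"]["filename"]
    key = f"incoming/{filename}"
    # The client PUTs the xlsx file straight to S3 with this URL,
    # bypassing API Gateway's payload limit entirely.
    url = s3.generate_presigned_url(
        "put_object",
        Params={"Bucket": "customer-uploads", "Key": key},  # placeholder bucket
        ExpiresIn=300,  # URL valid for 5 minutes
    )
    return {"statusCode": 200, "body": json.dumps({"uploadUrl": url, "key": key})}
```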
2. After user approval, process the data into the active tables.
Now the question is what the approval process is, what "process data" means, how long it may take, and what resources are needed.
If it's supposed to be a long-running process with multiple steps, using AWS Step Functions to coordinate the tasks may be a good option, but that depends on your use case.
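For instance, a hypothetical approval handler could kick off such a state machine; a sketch, where the state machine ARN and input shape are placeholders, not from your setup:

```python
import json
import boto3

sfn = boto3.client("stepfunctions")

def on_approval(event, context):
    # Start the long-running "process data to active tables" workflow.
    # Step Functions then coordinates the individual steps, retries, etc.
    sfn.start_execution(
        stateMachineArn="arn:aws:states:us-east-1:123456789012:stateMachine:ProcessApprovedData",
        input=json.dumps({"uploadId": event["uploadId"]}),  # assumed input shape
    )
    return {"statusCode": 202}  # accepted; work continues in the background
```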
For long-running jobs there are AWS Batch, EMR (or other compute resources), or even SQS + Lambda if you manage to break the work down into manageable pieces.
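If you go the SQS + Lambda route, the producer side could look like this minimal sketch (queue URL and message shape are assumptions; a Lambda subscribed to the queue then handles each message well within its own time limit):

```python
import json
import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/records-to-process"  # placeholder

def enqueue(records):
    # SQS accepts at most 10 messages per batch call, so chunk the records.
    for start in range(0, len(records), 10):
        chunk = records[start:start + 10]
        sqs.send_message_batch(
            QueueUrl=QUEUE_URL,
            Entries=[
                {"Id": str(start + i), "MessageBody": json.dumps(rec)}
                for i, rec in enumerate(chunk)
            ],
        )
```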
Upvotes: 2
Reputation: 522522
Generally you'll want to keep all HTTP requests as short as possible, meaning you never do any actual work within an HTTP request. All actual work is to be done in Lambda functions which were invoked via some other means, not API Gateway. The general flow is that an HTTP request only records what work needs to be done and returns immediately, background Lambda invocations do the actual work, and further HTTP requests merely check on the progress of that work.
As an example, I have implemented a system like that in this way:

- An HTTP request triggers a Lambda function which parses the uploaded file, stores each record in the database marked as pending, and returns immediately.
- A separate Lambda function, invoked asynchronously, groups the pending records into batches and dispatches each batch to its own Lambda instance.
- Each of those instances processes its batch record by record, updating each record's status in the database as it goes.
- The client polls a status endpoint, which simply reads the processing status from the database.
So, the large upload has been broken down into individual records, then individual batches of records, then each record is processed individually, and each individual step doesn't take more than a few seconds and is distributed across separate Lambda instances.
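A minimal sketch of the batch-dispatch step in that flow (the worker function name, batch size, and payload shape are illustrative, not from the original system):

```python
import json
import boto3

lam = boto3.client("lambda")

def dispatch_batches(record_ids, batch_size=50):
    # Fan the work out: one asynchronous ("Event") invocation per batch,
    # so batches run in parallel and no single invocation comes anywhere
    # near the 15-minute Lambda limit.
    for start in range(0, len(record_ids), batch_size):
        lam.invoke(
            FunctionName="process-record-batch",  # hypothetical worker function
            InvocationType="Event",  # fire-and-forget
            Payload=json.dumps({"ids": record_ids[start:start + batch_size]}),
        )
```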
Not only does this allow many records to be processed in parallel, it also never comes close to the 15-minute Lambda timeout. HTTP requests never come close to the 30-second limit either, since the initial parse-and-store step doesn't take very long, and each subsequent request just looks at the database for the status of the processing and doesn't do any work itself. Depending on your needs, if even the initial parse-and-store step would take longer than 30 seconds, you can upload the file to S3 and then trigger a Lambda function to process it from there.
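The status check itself stays trivially cheap; a sketch assuming a MySQL-compatible RDS instance, pymysql as the driver, and a hypothetical records table:

```python
import json
import pymysql  # assumed driver; any RDS-compatible client works

def status(event, context):
    # Cheap status request: just count the records of this upload
    # that are still pending. Connection details and schema are placeholders.
    conn = pymysql.connect(host="mydb.example.rds.amazonaws.com",
                           user="app", password="...", database="imports")
    try:
        with conn.cursor() as cur:
            cur.execute(
                "SELECT COUNT(*) FROM records WHERE upload_id=%s AND status='pending'",
                (event["pathParameters"]["uploadId"],),
            )
            (pending,) = cur.fetchone()
    finally:
        conn.close()
    return {"statusCode": 200,
            "body": json.dumps({"pending": pending, "done": pending == 0})}
```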
If any of those steps, even minimised and broken down like this, takes longer than the 15-minute Lambda timeout, then Lambda isn't the right platform for the job.
Upvotes: 8