Narendar Reddy M

Reputation: 1607

Long running jobs with AWS Gateway - Lambda - RDS

My architecture is API Gateway - Lambda - RDS. I take data from customers as xlsx files.

I have 2 steps

1. Process the file and store it into transaction tables.

2. After user approval, process the data into the active tables.

My question is about large files: due to the time limits (API Gateway 30 seconds, Lambda 15 minutes) and the synchronous process, I'm unable to complete the processing. Please suggest an AWS service that can run long jobs which need to interact continuously with RDS, to compare the file data against system data.

Upvotes: 2

Views: 1743

Answers (2)

gusto2

Reputation: 12085

> 1. Process the file and store into transaction tables.
> My Question is for large files

API Gateway has a payload limit (10 MB), so it is really not intended for passing larger payloads.

A common practice for large content is to use an API resource that returns a presigned URL, which the client can then use to upload the content directly to S3.

Then you could use another API resource or S3 events to invoke an action on object creation (start a process, a batch job, ...).
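The S3-event route might be sketched like this: a Lambda subscribed to `ObjectCreated` events pulls the bucket and key out of the event record and kicks off whatever processing you choose (the actual kick-off is left as a comment, since it depends on your setup):

```python
# Sketch: a Lambda invoked by an S3 ObjectCreated event notification.
def handler(event, context):
    uploads = []
    for record in event["Records"]:  # S3 can batch several records per event
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        # Here you would start the long-running work for this object:
        # submit a Batch job, queue an SQS message, start a Step Function, ...
        uploads.append((bucket, key))
    return uploads
```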

> 2. After user approval Process data to Active tables.

Now the questions are: what is the approval process, what does "process data" mean, how long may it take, and what resources are needed.

If it's supposed to be a long-running process with multiple steps, using AWS Step Functions to coordinate the tasks may be a good option, but that depends on your use case.

For long-running jobs there are AWS Batch, EMR (or other compute resources), or even SQS + Lambda if you manage to break the work down into manageable pieces.
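The SQS + Lambda option hinges on that break-down step. A minimal sketch, assuming records are numbered 1..N in the transaction tables and each SQS message names a range to process (queue URL and batch size are illustrative):

```python
# Sketch: split a large job into SQS-sized work items.
import json

def chunk_ranges(total, size):
    """Yield (start, end) record ranges, 1-indexed inclusive."""
    for start in range(1, total + 1, size):
        yield start, min(start + size - 1, total)

def enqueue_work(sqs, queue_url, total_records, batch_size=20):
    """Queue one SQS message per range, e.g. 'process records 1-20'."""
    for start, end in chunk_ranges(total_records, batch_size):
        sqs.send_message(
            QueueUrl=queue_url,
            MessageBody=json.dumps({"start": start, "end": end}),
        )
```

Each message then fits comfortably inside one short Lambda invocation, regardless of how large the original file was.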

Upvotes: 2

deceze

Reputation: 522522

Generally you'll want to keep all HTTP requests as short as possible, meaning you never do any actual work within an HTTP request. All actual work is to be done in Lambda functions which were invoked via some other means, not API Gateway. The general flow would be something like this:

  1. Client uploads large file, gets a unique id for this upload in return.
  2. Server parses the file into database records.
  3. Server starts processing those database records.
  4. Client can inquire about the status of the processing at any point using the id.
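Step 4 of that flow can stay trivially cheap: the status endpoint only reads counters from the database and derives a state. A small sketch of that derivation (the payload shape is my own invention):

```python
# Sketch: summarise processing progress for the status endpoint.
def status_summary(total, processed):
    """Return the status payload the client polls with the upload id."""
    if total == 0:
        return {"state": "pending", "progress": 0.0}
    state = "done" if processed == total else "processing"
    return {"state": state, "progress": round(processed / total, 2)}
```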

As an example, I have implemented a system like that in this way:

  1. Browser uploads large Excel file, that file is read and written to the database row by row, in batches of hundreds of rows at a time, all tagged with the same random UUID. The file is then discarded naturally when the request ends. This batch reading/writing is optimised so even very large files can be pushed into the database in a matter of seconds.
  2. As a last action before ending the upload process, an SQS message is queued with the UUID of the uploaded records.
  3. The SQS queue invokes a Lambda function which queries all records with that UUID, and puts new SQS messages in the queue with the individual ids of each record, in batches of a few dozen (i.e. messages to "process record 1-20", "21-40" etc.).
  4. The queue invokes another Lambda handler again, which processes each record one by one.
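The worker Lambda from steps 3-4 above can be sketched as follows. It receives SQS messages each naming a record range and processes the records one by one; `process_record` is a stand-in for the real per-record comparison work:

```python
# Sketch: the final worker Lambda, invoked by SQS with record ranges.
import json

def process_record(record_id):
    # Placeholder for the real work: load the record, compare the file
    # data against system data, write the result back to the database.
    return record_id

def handler(event, context):
    done = []
    for message in event["Records"]:  # SQS delivers messages in batches
        body = json.loads(message["body"])
        for record_id in range(body["start"], body["end"] + 1):
            done.append(process_record(record_id))
    return {"processed": len(done)}
```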

So, the large upload has been broken down into individual records, then individual batches of records, then each record is processed individually, and each individual step doesn't take more than a few seconds and is distributed across separate Lambda instances.

Not only does this allow many records to be processed in parallel, it also never comes close to the 15 minute Lambda timeout. HTTP requests never come close to the 30 second limit either, since the initial parse-and-store step doesn't take very long, and each subsequent request just needs to look at the database for the status of the processing and doesn't do anything by itself. Depending on your needs, if even the initial parse-and-store step would take longer than 30 seconds, you can upload the file to S3 and then trigger a Lambda function to process it from there.

If any of those steps, even as minimised and broken down as this, takes longer than the 15-minute Lambda timeout, then Lambda isn't the right platform for this job.

Upvotes: 8
