ChumiestBucket

Reputation: 1063

SQS triggers Lambda function over and over again, why?

My SQS queue is supposed to trigger a Lambda function once per message. Instead it seems to trigger the function repeatedly, and I cannot find the cause.

Here's how the system works:

  1. File is uploaded to S3 bucket (we'll call it uploaded-docs).

  2. Uploading a file to uploaded-docs/input/ triggers an event on the bucket, called FileUploaded.

  3. Event FileUploaded triggers a Lambda function called ProcessUploadedFile.

  4. ProcessUploadedFile calls Textract to analyze the document. When the Textract job finishes, its completion notification is published to an SNS topic called TextractComplete.

  5. TextractComplete has one subscription, an SQS queue called TextractOutputQueue.

  6. TextractOutputQueue triggers a Lambda function called GetOutput. It's only supposed to run once for each file uploaded.
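For reference, the last step of the pipeline above could look roughly like the following. This is only a sketch, not the actual GetOutput code: it assumes raw message delivery is disabled on the SNS subscription (so each SQS body is an SNS envelope whose Message field holds Textract's JSON notification), and the helper name extract_textract_jobs is made up for illustration.

```python
import json


def extract_textract_jobs(event):
    """Return (job_id, status) pairs from an SQS event fed by an SNS topic.

    Assumes raw message delivery is OFF, so the SQS body is the SNS
    envelope and the Textract notification is nested inside "Message".
    """
    jobs = []
    for record in event.get("Records", []):
        sns_envelope = json.loads(record["body"])
        notification = json.loads(sns_envelope["Message"])
        jobs.append((notification["JobId"], notification["Status"]))
    return jobs


def handler(event, context):
    # Hypothetical entry point for GetOutput: one log line per finished job.
    for job_id, status in extract_textract_jobs(event):
        print(f"Textract job {job_id} finished with status {status}")
```

If the handler raises or times out, Lambda does not delete the batch from the queue, so the same records are delivered again later.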

I've noticed that when a file is uploaded, GetOutput is invoked over and over for that file. The logs show "Task timed out after 3 seconds" (during Textract's GetDocumentTextDetection call, made via boto3) again and again until the invocations finally stop.

I also noticed that when a new file is uploaded, all the files already in the bucket go through this process again.

Some hypotheses:

  1. The SQS queue items aren't being "consumed", such that they still exist in the queue after calling their respective Lambda functions, and because they still exist, they keep invoking the function.

  2. The Lambda GetOutput is stuck in some sort of retry loop, and every time it times out, it calls the entire function again. This doesn't explain why the logs will be dead until I upload another document, and all the previously-uploaded documents enter the loop again somehow.
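One way to test both hypotheses is to log how many times each message has been received. Each record in an SQS-triggered Lambda event carries an ApproximateReceiveCount attribute (as a string); a value above 1 means the message went back to the queue and was delivered again. The sketch below is only a diagnostic aid, and the function names are made up for illustration.

```python
def receive_counts(event):
    """Map each SQS messageId to its ApproximateReceiveCount as an int."""
    counts = {}
    for record in event.get("Records", []):
        # SQS reports the count as a string, so convert it.
        counts[record["messageId"]] = int(
            record["attributes"]["ApproximateReceiveCount"]
        )
    return counts


def redelivered(event):
    """Return the ids of messages that have been received more than once."""
    return [msg_id for msg_id, n in receive_counts(event).items() if n > 1]


def handler(event, context):
    # Hypothetical logging at the top of GetOutput.
    for msg_id in redelivered(event):
        print(f"Message {msg_id} is being retried")
```

Counts that keep climbing for the same messageId would point to failed invocations (for example timeouts) leaving the message in the queue, rather than to new messages being published.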

Upvotes: 2

Views: 2192

Answers (1)

ChumiestBucket

Reputation: 1063

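Thanks to everyone who commented; it was a very simple fix. I increased the timeout of GetOutput from the default 3 seconds to 10 seconds and the issue stopped. Each timed-out invocation counted as a failure, so the message was never deleted from the queue and kept being delivered again.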
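For anyone who wants to make the same change from code rather than the console, here is a minimal sketch. It assumes a boto3 Lambda client is passed in, and the helper names are made up for illustration. The second helper reflects AWS's guidance, as I understand it, that the queue's visibility timeout should be at least six times the function timeout.

```python
def raise_timeout(lambda_client, function_name, seconds):
    """Set the function's timeout and return the value Lambda reports back."""
    response = lambda_client.update_function_configuration(
        FunctionName=function_name,
        Timeout=seconds,
    )
    return response["Timeout"]


def suggested_visibility_timeout(function_timeout_seconds):
    """Visibility timeout suggested by AWS guidance (6x the function timeout)."""
    return 6 * function_timeout_seconds


# Example usage (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# raise_timeout(boto3.client("lambda"), "GetOutput", 10)
```

With a 10 second function timeout, that guidance would put the visibility timeout of TextractOutputQueue at 60 seconds or more.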

Upvotes: 2
