Reputation: 1063
My SQS queue is supposed to trigger a Lambda function once. Instead, it seems to trigger it repeatedly, and I cannot find the cause.
Here's how the system works:
A file is uploaded to an S3 bucket (we'll call it uploaded-docs).
Uploading a file to uploaded-docs/input/ triggers an event on the bucket, called FileUploaded.
The FileUploaded event triggers a Lambda function called ProcessUploadedFile.
ProcessUploadedFile calls Textract to analyze a document. When the Textract process finishes, its output is published to an SNS topic called TextractComplete.
TextractComplete has one subscription, the SQS queue called TextractOutputQueue.
TextractOutputQueue triggers a Lambda function called GetOutput. It's only supposed to run once for each file uploaded.
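To make the last two steps concrete, here is a minimal sketch of how GetOutput might unpack its input. It assumes the SNS subscription does not use raw message delivery, so each SQS record body is an SNS envelope whose Message field holds the Textract completion JSON with JobId and Status. The names extract_textract_jobs and handler are hypothetical, not taken from my actual function.

```python
import json


def extract_textract_jobs(event):
    """Return (job_id, status) pairs from an SQS-triggered Lambda event.

    Assumption: each record body is an SNS envelope (raw message delivery
    disabled) whose "Message" field is the Textract completion JSON.
    """
    jobs = []
    for record in event.get("Records", []):
        envelope = json.loads(record["body"])       # SNS envelope
        message = json.loads(envelope["Message"])   # Textract notification
        jobs.append((message["JobId"], message["Status"]))
    return jobs


def handler(event, context):
    # Hypothetical GetOutput entry point: it only parses and reports the jobs.
    jobs = extract_textract_jobs(event)
    for job_id, status in jobs:
        print(f"Textract job {job_id} finished with status {status}")
    return {"jobs": len(jobs)}
```

If the handler returns normally, Lambda deletes the batch from the queue; if it raises or times out, the messages stay in the queue and are delivered again.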
I've noticed that when a file is uploaded, GetOutput is called over and over on the file, and the message Task timed out after 3 seconds (from Textract's GetDocumentTextDetection command, called via boto3) can be found in the logs until it finally exceeds.
I also noticed that when a new file is uploaded, all the files in the bucket are called again in this process.
Some hypotheses:
The SQS queue items aren't being "consumed", such that they still exist in the queue after calling their respective Lambda functions, and because they still exist, they keep invoking the function.
The Lambda GetOutput is stuck in some sort of retry loop, and every time it times out, it calls the entire function again. This doesn't explain why the logs will be dead until I upload another document, and all the previously-uploaded documents enter the loop again somehow.
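One way to test both hypotheses is to look at how many times each message has been received. The sketch below relies on the ApproximateReceiveCount attribute that SQS includes on every record in the Lambda event; the function name redelivered_records is hypothetical.

```python
def redelivered_records(event):
    """Return (messageId, receive_count) for records received more than once.

    A count above 1 means the message was not deleted after an earlier
    delivery, for example because the previous invocation failed or timed out.
    """
    repeats = []
    for record in event.get("Records", []):
        # SQS reports the attribute as a string, so convert it first.
        count = int(record["attributes"]["ApproximateReceiveCount"])
        if count > 1:
            repeats.append((record["messageId"], count))
    return repeats
```

Logging the result at the top of GetOutput would show whether the repeated invocations are the same messages coming back rather than new ones.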
Upvotes: 2
Views: 2192
Reputation: 1063
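Thanks to all who commented; it was a very simple fix. I just increased the timeout of GetOutput to 10 seconds and the issue stopped. Presumably, with the 3-second timeout the function was failing before it finished, so each message stayed in the queue and triggered GetOutput again.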
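For anyone who prefers to script the change instead of using the console, here is a rough sketch. It assumes a boto3 Lambda client is passed in; raise_timeout and suggested_visibility_timeout are hypothetical helper names. The second helper reflects the AWS guidance that the queue's visibility timeout should be at least six times the function timeout.

```python
def raise_timeout(lambda_client, function_name, seconds):
    """Set a Lambda function's timeout via a boto3 'lambda' client."""
    # Lambda accepts timeouts from 1 to 900 seconds.
    if not 1 <= seconds <= 900:
        raise ValueError("Lambda timeouts must be between 1 and 900 seconds")
    lambda_client.update_function_configuration(
        FunctionName=function_name,
        Timeout=seconds,
    )
    return seconds


def suggested_visibility_timeout(function_timeout_seconds):
    # Assumption: follow the AWS recommendation of at least 6x the
    # function timeout for the source queue's visibility timeout.
    return 6 * function_timeout_seconds
```

Usage would look like raise_timeout(boto3.client("lambda"), "GetOutput", 10), after which suggested_visibility_timeout(10) gives 60 seconds as a lower bound for TextractOutputQueue.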
Upvotes: 2