Reputation: 1505
I know it sounds strange, but something is wrong with my AWS SNS =)
I have a Lambda function that sends messages to AWS SNS. I also have several SQS queues subscribed to that SNS topic, dead-letter queues for both SNS and SQS, and delivery/failure logging turned on for SNS at a 100% sample rate.
In most cases, my architecture works as expected - Lambda sends messages to SNS and they reach the subscribed SQS queues.
But sometimes something goes wrong between Lambda and SNS: the publish call returns a success response like the one below, yet the message never shows up downstream:
{'MessageId': '292af724-XXXc49658c0', 'SequenceNumber': '10000000000000000551',
'ResponseMetadata': {'RequestId': 'ba126582-XXX8f2',
'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amzn-requestid': 'ba126582-XXX18f2',
'content-type': 'text/xml', 'content-length': '352',
'date': 'Thu, 29 Apr 2021 13:00:28 GMT'}, 'RetryAttempts': 0}}
So, my question is: how is this possible? And how can I fix it?
REMARK - I am using FIFO SNS / SQS
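For reference, the publish call is roughly the following (the topic ARN, group ID, and payload here are placeholders, not my real values):

import json
import boto3

sns = boto3.client('sns')

# Publish to the FIFO topic; MessageGroupId and MessageDeduplicationId
# are required for FIFO topics unless content-based deduplication is on.
response = sns.publish(
    TopicArn='arn:aws:sns:us-east-1:123456789012:my-topic.fifo',  # placeholder ARN
    Message=json.dumps({'payload': 'example'}),
    MessageGroupId='my-group',
    MessageDeduplicationId='placeholder-dedup-id',
)
print(response)  # prints a dict like the one above, with MessageId and SequenceNumber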
Upvotes: 1
Views: 3776
Reputation: 11
I know this question is a little old, but I wanted to answer just in case you or others are facing this issue and are mystified.
Anarki's and DazedAndConfused's answers both make valid points. One angle you should look at is your own code. If your process flow is:
Lambda 1 > SNS > SQS > Lambda 2
Your Lambda 2 function must process the SQS Records in a loop. Even if Lambda 1 published each message as a separate SNS event, several messages can be batched into a single invocation and will all be handed to Lambda 2 together.
With that being said, it's very possible that you have a return statement inside your Lambda 2 loop. In that case, after processing Records[0], Lambda 2 will terminate, effectively skipping all subsequent Records.
Using Python as an Example
For instance, this:
def lambda_handler(event, context):
    print('Event data is: ' + str(event))
    for record in event['Records']:
        print(record['messageId'])
        # return inside the loop: the function exits after the first record
        return {
            "statusCode": 200,
            "body": "Success!"
        }
...is a lot different than this:
def lambda_handler(event, context):
    print('Event data is: ' + str(event))
    for record in event['Records']:
        print(record['messageId'])
    # return after the loop: every record in the batch is processed
    return {
        "statusCode": 200,
        "body": "Success!"
    }
Having had the same confusion and questions as you, I unfortunately learned this the hard way.
Upvotes: 1
Reputation: 111
Check that the SQS trigger on your Lambda has a batch size of 1. If your output Lambda is designed to handle exactly one request at a time, several items can be popped off the queue in a single batch and give the illusion of being lost. If your Lambda is short and fast, batching may even be desirable... you just need to be aware that it happens.
I have a slightly different setup to yours, but I thought it would be worth sharing regardless.
So technically you can fix things by changing step 3 or 4, depending on what works for you. The easiest way to change the batch size is to create a new trigger. Although it can be done via the CLI, I don't know the command.
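For what it's worth, a boto3 sketch along these lines should do the same thing (the function name is a placeholder, and it assumes the SQS queue is the function's only event source mapping):

import boto3

lambda_client = boto3.client('lambda')

# Find the event source mapping that connects the SQS queue to the Lambda
mappings = lambda_client.list_event_source_mappings(
    FunctionName='my-output-lambda'  # placeholder name
)['EventSourceMappings']

# Drop the batch size to 1 so each invocation receives a single record
lambda_client.update_event_source_mapping(
    UUID=mappings[0]['UUID'],
    BatchSize=1,
)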
Upvotes: 0
Reputation: 115
Is it possible you're just seeing the result of deduplication? https://docs.aws.amazon.com/sns/latest/dg/fifo-message-dedup.html
If you reuse the same deduplication ID, or if you have content-based deduplication switched on and publish an identical message body, the duplicate won't be delivered within the 5-minute deduplication window.
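If the repeated payloads are intentional, one way around this is to give every publish its own deduplication ID, roughly like this (the topic ARN, group ID, and message are placeholders):

import json
import uuid
import boto3

sns = boto3.client('sns')

# A fresh deduplication ID per publish, so identical message bodies sent
# within the 5-minute window are not silently dropped.
sns.publish(
    TopicArn='arn:aws:sns:us-east-1:123456789012:my-topic.fifo',  # placeholder ARN
    Message=json.dumps({'payload': 'example'}),
    MessageGroupId='my-group',
    MessageDeduplicationId=str(uuid.uuid4()),
)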
SNS/SQS have such epic durability that it would be almost impossible to randomly lose messages unless you're processing billions per hour.
Upvotes: 1