Trevor

Reputation: 1035

How can I check if a message is about to pass the MessageRetentionPeriod?

I have an app that uses SQS to queue jobs. Ideally I want every job to be completed, but some are going to fail. Sometimes re-running them will work, and sometimes they will just keep failing until the retention period is reached. I want to keep failing jobs in the queue as long as possible, to give them the maximum possible chance of success, so I don't want to set a maxReceiveCount. But I do want to detect when a job reaches the MessageRetentionPeriod limit, as I need to send an alert when a job fails completely. Currently I have the retention period at the maximum of 14 days, but some jobs will still not be completed by then.

Is there a way to detect when a job is about to expire, and from there send it to a dead-letter queue for additional processing?

Upvotes: 2

Views: 3296

Answers (1)

Anthony Neace

Reputation: 26031

A caveat before you follow my advice below: assuming I've done the math for the periods correctly, if you check for messages less often than every 20 minutes and 9 seconds, you will be better off enabling a redrive policy on the queue instead.

SQS's "redrive policy" allows you to migrate messages to a dead-letter queue after a threshold number of receives. The maximum number of receives that AWS allows for this is 1000, and over 14 days that works out to about 20 minutes per receive. (For simplicity, that assumes your job never misses an attempt to read queue messages. You can tweak the numbers to build in a tolerance for failure.)
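For reference, the arithmetic behind the 20-minute figure can be reproduced with a few lines of Python. This is only a sketch of the calculation; the function name `min_receive_interval` is made up for illustration and is not part of any AWS SDK.

```python
from datetime import timedelta

MAX_RECEIVE_COUNT = 1000        # upper limit for maxReceiveCount, per the answer
RETENTION = timedelta(days=14)  # the 14-day retention period from the question


def min_receive_interval(retention: timedelta, max_receives: int) -> timedelta:
    """Average gap between receives at which max_receives spans the whole retention period."""
    return retention / max_receives


interval = min_receive_interval(RETENTION, MAX_RECEIVE_COUNT)
print(interval)                  # 0:20:09.600000
print(interval.total_seconds())  # 1209.6
```

If receives happen more often than this on average, the receive count would hit 1000 before the 14 days are up, which is why the timestamp-based approach is needed in that case.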

If you check more often than that, you'll want to implement the solution below.


You can check for this "cutoff date" (when the job is about to expire) as you process the messages, and send messages to the dead-letter queue if they've passed the time when you've given up on them.

Pseudocode to add to your current routine:

  • Call GetQueueAttributes to get the count, in seconds, of your queue's Message Retention Period.
  • Call ReceiveMessage to pull messages off of the queue. Make sure to explicitly request the SentTimestamp attribute, since it is not returned by default.
  • For each message:
    • Find your message's expiration time by adding the message retention period to the sent timestamp.
    • Create your cutoff date by subtracting your desired amount of time from the message's expiration time.
    • Compare the cutoff date with the current time. If the cutoff date has passed:
      • Call SendMessage to send your message to the Dead Letter queue.
      • Call DeleteMessage to remove your message from the queue you are processing.
    • If the cutoff date has not passed:
      • Process the job as normal.
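
The date arithmetic in the steps above can be sketched independently of any SDK. The following Python is a minimal illustration, not a full handler: `cutoff_time` and `should_dead_letter` are hypothetical helper names, the one-hour margin is just an example value, and the actual GetQueueAttributes, ReceiveMessage, SendMessage and DeleteMessage calls are left out. SQS returns both attributes as strings, so convert them with `int()` before passing them in.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def cutoff_time(sent_timestamp_ms: int, retention_seconds: int, margin: timedelta) -> datetime:
    """Return the UTC time after which the message should be dead-lettered."""
    # SentTimestamp is epoch time in milliseconds.
    sent = EPOCH + timedelta(milliseconds=sent_timestamp_ms)
    # Expiration time = sent time + message retention period.
    expiration = sent + timedelta(seconds=retention_seconds)
    # Cutoff = expiration time minus the desired safety margin.
    return expiration - margin


def should_dead_letter(sent_timestamp_ms: int, retention_seconds: int,
                       margin: timedelta, now: datetime) -> bool:
    """True if the cutoff has passed and the message should go to the dead-letter queue."""
    return now >= cutoff_time(sent_timestamp_ms, retention_seconds, margin)


# Example: a message sent at 2024-01-01 00:00:00 UTC to a queue with 14-day retention.
sent_ms = 1704067200000
retention = 1209600  # 14 days in seconds
margin = timedelta(hours=1)

print(cutoff_time(sent_ms, retention, margin))  # 2024-01-14 23:00:00+00:00
print(should_dead_letter(sent_ms, retention, margin,
                         datetime(2024, 1, 14, 22, 59, tzinfo=timezone.utc)))  # False
```

Keeping everything in UTC avoids comparing the epoch-based sent time against a local clock by accident.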

Here's an example implementation in PowerShell:

$queueUrl = "https://sqs.amazonaws.com/0000/my-queue"
$deadLetterQueueUrl = "https://sqs.amazonaws.com/0000/deadletter"

# Get the message retention period in seconds
$messageRetentionPeriod = (Get-SQSQueueAttribute -AttributeNames "MessageRetentionPeriod" -QueueUrl $queueUrl).Attributes.MessageRetentionPeriod

# Receive messages from our queue, requesting the SentTimestamp attribute.
$queueMessages = @(Receive-SQSMessage -QueueUrl $queueUrl -WaitTimeSeconds 5 -AttributeNames SentTimestamp)

foreach($message in $queueMessages)
{
    # The sent timestamp is in epoch time (milliseconds, UTC).
    $sentTimestampUnix = $message.Attributes.SentTimestamp

    # For PowerShell, we need to do some quick conversion to get a DateTime (in UTC).
    $sentTimestamp = ([datetime]'1970-01-01 00:00:00').AddMilliseconds([double]$sentTimestampUnix)

    # Get the expiration time by adding the retention period (in seconds) to the sent time.
    $expirationTime = $sentTimestamp.AddSeconds([double]$messageRetentionPeriod)

    # I want my cutoff date to be one hour before the expiration time.
    $cutoffDate = $expirationTime.AddHours(-1)

    # Check if the cutoff date has passed. Compare against the current time in UTC,
    # because the sent timestamp is UTC while Get-Date returns local time.
    if((Get-Date).ToUniversalTime() -ge $cutoffDate)
    {
        # Cutoff date has passed: copy the message to the dead-letter queue,
        # then delete it from the queue we are processing.
        Send-SQSMessage -QueueUrl $deadLetterQueueUrl -MessageBody $message.Body

        Remove-SQSMessage -QueueUrl $queueUrl -ReceiptHandle $message.ReceiptHandle -Force
    }
    else
    {
        # Cutoff date has not passed. Process (retry) the job as normal.
    }
}

This will add some overhead to every message you process. It also assumes that your message handler will receive the message between the cutoff time and the expiration time; if no receive happens in that window, the message expires without being moved. Make sure that your application is polling often enough to receive the message.
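
As a rough sanity check on "often enough", you can compare the margin before expiration with the longest gap you expect between two receives of the same message. The sketch below is only an estimate under stated assumptions: `margin_is_safe` is a hypothetical helper, and it assumes the worst-case gap is one polling interval plus the queue's visibility timeout, and that a poll always returns the message once it is visible again (which SQS does not strictly guarantee on a busy queue).

```python
from datetime import timedelta


def margin_is_safe(margin: timedelta, poll_interval: timedelta,
                   visibility_timeout: timedelta) -> bool:
    """True if at least one receive should land between the cutoff and the expiration."""
    # Assumption: after a failed attempt the message stays hidden for the
    # visibility timeout, then waits up to one polling interval to be received again.
    worst_case_gap = poll_interval + visibility_timeout
    return worst_case_gap < margin


# One-hour margin, polling every 5 minutes, 30-minute visibility timeout.
print(margin_is_safe(timedelta(hours=1), timedelta(minutes=5), timedelta(minutes=30)))   # True
# Polling only every 45 minutes leaves a gap longer than the margin.
print(margin_is_safe(timedelta(hours=1), timedelta(minutes=45), timedelta(minutes=30)))  # False
```

If the check fails, either poll more often or move the cutoff further ahead of the expiration time.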

Upvotes: 2
