Mehran

Reputation: 16831

Called a lambda function once, it's executed twice

This is more of a concern than a question, but still, has anyone experienced this before? Does anyone know how to prevent it?

I have a Lambda function (L1) which calls a second Lambda function (L2), both written in Node.js (runtime: Node.js 8.10, and aws-sdk should be v2.488.0 - but I'm just pasting that from the documentation). The short story is that L1 is supposed to call L2, and when it does, L2 is executed twice! I discovered this by writing logs to CloudWatch, where I could see one L1 log and two L2 logs.

Here's a simplified version of L1 and L2.

L1:

const AWS = require('aws-sdk');
const lambda = new AWS.Lambda();

module.exports = {
    handler: async (event, context, callback) => {
        const payload = { rnd: Math.random() };
        const lambdaParams = {
            FunctionName: 'L2',
            Qualifier: `dev`,
            Payload: JSON.stringify(payload),
        };

        console.log(`L1 calling: ${JSON.stringify(payload)}`);
        return await lambda.invoke(lambdaParams).promise();
    },
};

L2:

module.exports = {
    handler: async (event, context, callback) => {
        console.log(`L2 called: ${JSON.stringify(event)}`);
    },
};

In CloudWatch I can see one L1 calling {"rnd": 0.012072353149807702} and two L2 called: {"rnd": 0.012072353149807702}!

BTW, this does not happen all the time. This is part of a Step Functions process that was going to call L1 10k times. My code is written in a way that if L2 is executed twice for a single call, the whole process fails (because L2 inserts a record into the DB only if it does not exist, and fails if it does). So far, I have managed to log this behaviour three times, all while processing the same 10k items, with the issue appearing at a different iteration each time.

Does anyone have the same experience? Or even better, knows how to make sure one call leads to exactly one execution?

Upvotes: 28

Views: 24696

Answers (8)

Alena Melnikova

Reputation: 1521

Lambda can be triggered twice if your layer with dependencies is too large. If in the CloudWatch logs you see INIT_REPORT Init Duration: 10000+ ms Phase: init Status: timeout, then you probably need to reduce the size of your zip file, or implement idempotence.

When Lambda starts execution in the runtime environment, it has 10 seconds for the INIT phase, where it loads dependencies. If it can't finish in 10 seconds, Lambda retries the Init phase, which means it triggers the function again. See the official explanation of what happens during the Init phase.

From my experiments, zip files smaller than 20 MB have an Init Duration (see it in the CloudWatch logs) of less than 10 seconds and don't cause Lambda to trigger twice.
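
A quick way to check for this is a CloudWatch Logs Insights query over the function's log group, something along these lines (adjust the time window to when the duplicates occurred):

fields @timestamp, @message
| filter @message like /INIT_REPORT/ and @message like /timeout/
| sort @timestamp desc
| limit 20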

Upvotes: 2

Albert Baucells

Reputation: 59

I just ran into the same problem. When calling the Lambda from a browser, it ran twice. I realized the second invocation came from the browser requesting /favicon.ico. I was using Chrome. I hope it helps.
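
If your function sits behind API Gateway or a Lambda function URL, a guard along these lines (just a sketch, assuming a proxy-style event that carries the request path) can ignore the favicon request:

module.exports = {
    handler: async (event) => {
        // Browsers request /favicon.ico alongside the page; skip that call
        const path = event.rawPath || event.path || '';
        if (path === '/favicon.ico') {
            return { statusCode: 204, body: '' };
        }
        // ... real work goes here
        return { statusCode: 200, body: 'OK' };
    },
};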

Upvotes: 4

patelmanishin

Reputation: 41

I was also facing this issue, in the context of code deployments to our production environment. We were checking multiple stages, such as whether the Jira ticket is in the "ready to deploy" state and whether the GitHub pull requests are mergeable.

Solution: for every incoming request, we started storing its ID in an S3 bucket and validating it before processing the request. Below is a detailed code example.

Input Event Parameters:

{
    "id": 1234,
    "data": {
        "name": "sonoo",
        "salary": 56000,
        "married": true
    }
}

app.py

import boto3

def lambda_handler(event, context):
    # Extract relevant context data from the event
    event_id = event.get('id')
    event_data = event.get('data')

    # Check if the event has already been processed
    if is_event_processed(event_id):
        return {
            'statusCode': 200,
            'body': 'Event already processed'
        }

    # Perform authorization check if required
    if not is_authorized(event_data):
        return {
            'statusCode': 403,
            'body': 'Unauthorized event execution'
        }

    # Process the event
    process_event(event_data)

    # Update storage with processed event information
    mark_event_as_processed(event_id)

    return {
        'statusCode': 200,
        'body': 'Event processed successfully'
    }

def is_event_processed(event_id):
    # Check if the event ID exists as an object in the S3 bucket
    s3 = boto3.client('s3')
    bucket_name = 'YourBucketName'
    object_key = f'processed_events/{event_id}'
    
    try:
        s3.head_object(Bucket=bucket_name, Key=object_key)
        return True
    except s3.exceptions.ClientError:
        # head_object raises a ClientError (e.g. 404) when the marker object is absent
        return False

def is_authorized(event_data):
    # Implement your authorization logic here
    # Return True if authorized, False otherwise
    return True

def process_event(event_data):
    # Implement your event processing logic here
    pass

def mark_event_as_processed(event_id):
    # Upload an empty object to the S3 bucket to mark the event as processed
    s3 = boto3.client('s3')
    bucket_name = 'YourBucketName'
    object_key = f'processed_events/{event_id}'

    s3.put_object(Body=b'', Bucket=bucket_name, Key=object_key)

Upvotes: 1

Chetan Kumar

Reputation: 11

Good morning,

I was also facing this issue. In the case of algorithmic trading, it really costs money, as the Lambda function was sending two orders for the same script to the exchange, resulting in higher broker charges.

A little background on my Lambda function: it gets triggered from EventBridge at 03:43 UTC (9:13 IST) and then sleeps until 03:45 UTC (9:15 IST) to execute the order precisely at the opening bell. I kept this 2-minute gap between trigger and execution because loading modules takes a different amount of time in each instance. The sleep code block in the Lambda function, prior to the fix:

from datetime import datetime
from time import sleep

# Generating sleep to execute the order exactly at 9:15:00 IST
current_time = datetime.now()
current_time = current_time.replace(year=1900, month=1, day=1)
print("Current time for gap calculation: ", current_time)

target_time = datetime(1900, 1, 1, 3, 44, 59, 996000)
print("Target time for gap calculation: ", target_time)

gap = (target_time - current_time ).total_seconds()
print(f'Gap: {gap}')

if gap > 0 :
    sleep(gap)

Prior to this code block, I am loading modules and after this code block, I am placing an order to exchange.

The first time the function ran, it showed: 2023-07-06T03:43:35.417+00:00 Gap: 84.578606

As per the problem, the function ran automatically a second time, and the gap is negative because we have already passed the target time of 03:45 UTC (9:15 IST): 2023-07-06T03:45:09.884+00:00 Gap: -9.88857

So, to stop the function from executing any further on the second attempt, I inserted an else statement in the sleep code block.

else:
    return 'Second Attempt'

Now my complete sleep code block looks like this:

from datetime import datetime
from time import sleep

# Generating sleep to execute the order exactly at 03:45:00 UTC (9:15:00 IST)
current_time = datetime.now()
current_time = current_time.replace(year=1900, month=1, day=1)
print("Current time for gap calculation: ", current_time)

target_time = datetime(1900, 1, 1, 3, 44, 59, 996000)
print("Target time for gap calculation: ", target_time)

gap = (target_time - current_time ).total_seconds()
print(f'Gap: {gap}')

if gap > 0 :
    sleep(gap)

else:
    return 'Second Attempt'

This way, when my Lambda function automatically runs a second time, it finds the gap value to be negative and exits with the message 'Second Attempt'. It never reaches the real part after the sleep code block that sends orders to the exchange.

I am learning Python, Lambda, and trading, so please forgive any mistakes or shortcomings in my presentation. This approach solved the problem of the Lambda function getting executed twice for me.

Upvotes: 1

In my case, it happened when calling my second Lambda: a try block was catching an exception and printing it with traceback, and this caused Lambda to retry the call. Normally, without traceback, this does not happen, and when I commented out that module it stopped happening.

The try block also contained a condition that was bound to fail, since it had to query a resource with boto3; if the resource existed there was no problem, but since there was no traceback, the error forced a general failure instead of being captured as an exception.

Upvotes: 0

Z.Wei

Reputation: 3938

One case that may cause this is that your L2 Lambda doesn't return anything, which leads the L1 Lambda (the caller) to think there was an error with L2, so the retry mechanism is triggered. Try to return something from L2, even simply an "OK".
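
For the L2 from the question that is a one-line change; a minimal sketch based on the poster's handler:

module.exports = {
    handler: async (event) => {
        console.log(`L2 called: ${JSON.stringify(event)}`);
        // Return an explicit value so the caller gets a resolved payload
        return 'OK';
    },
};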

Upvotes: 7

Azize

Reputation: 4476

Your lambda function must be idempotent, because it can be called twice in different situations.

https://aws.amazon.com/premiumsupport/knowledge-center/lambda-function-idempotent/

https://cloudonaut.io/your-lambda-function-might-execute-twice-deal-with-it/
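
Since the question mentions that L2 only inserts a record into the DB if it does not already exist, one way to make that idempotent (a sketch only; it assumes DynamoDB and makes up the table and key names) is a conditional write that treats a duplicate delivery as already done instead of failing the whole process:

const AWS = require('aws-sdk');
const db = new AWS.DynamoDB.DocumentClient();

module.exports = {
    handler: async (event) => {
        try {
            // Insert only if no item with this id exists yet
            await db.put({
                TableName: 'Records',                   // hypothetical table name
                Item: { id: event.rnd, payload: event }, // hypothetical key layout
                ConditionExpression: 'attribute_not_exists(id)',
            }).promise();
        } catch (err) {
            // A duplicate delivery trips the condition; treat it as already processed
            if (err.code !== 'ConditionalCheckFailedException') {
                throw err;
            }
        }
        return 'OK';
    },
};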

Upvotes: 19

WaltDe

Reputation: 1832

With 10K Lambda invocations, it is almost certainly hitting a failure somewhere and doing a retry.

From the documentation:

Asynchronous Invocation – Lambda retries function errors twice. If the function doesn't have enough capacity to handle all incoming requests, events may wait in the queue for hours or days to be sent to the function. You can configure a dead-letter queue on the function to capture events that were not successfully processed. For more information, see Asynchronous Invocation.

If this is what is happening and you set up the dead-letter queue, you'll be able to isolate the failing event.
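
If you want to attach the dead-letter queue from code, a minimal sketch with the aws-sdk v2 client used in the question (the SQS queue ARN below is a placeholder) would be:

const AWS = require('aws-sdk');
const lambda = new AWS.Lambda();

// Point L2's failed asynchronous invocations at an existing SQS queue
lambda.updateFunctionConfiguration({
    FunctionName: 'L2',
    DeadLetterConfig: {
        TargetArn: 'arn:aws:sqs:us-east-1:123456789012:l2-dlq', // placeholder ARN
    },
}).promise()
    .then(() => console.log('Dead-letter queue attached'))
    .catch(console.error);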

You can also use CloudWatch Logs Insights to quickly search for the Lambda's error messages. Once you select the log group, this query should help you get started; just change the time window.

fields @timestamp, @message
| filter @message like /(?i)(Exception|error|fail|5\d\d)/
| sort @timestamp desc
| limit 20

Upvotes: 3
