Invoking lambda many times in parallel seems to execute in concurrent batches of 50, not with the higher specified reserved concurrency

Question

TL;DR For a batch of 500 lambdas executed in parallel, I'm observing them being executed in concurrent batches of 50, despite specified reserved concurrency of 500. Why is that?

Hi,

I'm new to AWS lambda and having trouble understanding the concurrency behaviour I'm seeing.

I am invoking a lambda function "calc-group" [from the AWS web interface or CLI], which invokes another lambda function "calc-number" 500 times in parallel. The latter has a reserved concurrency of 500. [The lambdas all run and the calculation results are all fine].

"calc-number" takes about 1s to execute, but "calc-group" takes 10s to execute. The Concurrent executions chart suggests I'm getting a concurrency of only 50, consistent with the 10x timing I'm seeing. [Note: a more detailed implementation of "calc-number", not shown here, also gave evidence that only 50 lambda execution contexts start, each handling 10 requests sequentially].

I'm using Promise.all on promises for synchronous lambda.invoke calls.

I have read https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-limits.html and https://docs.aws.amazon.com/lambda/latest/dg/invocation-scaling.html but don't understand what is happening.

Below is a much stripped down example isolating the behaviour. [Note: I know the memory here is much higher than needed, but it makes the timing more stable and the original code is CPU bound benefiting from this setting].

I would much appreciate any suggestions for how I could get all 500 executing in parallel...

Many thanks!

EDIT: simplified the code having read more on error handling in async node.js handlers + minor tidy-up

EDIT: FYI, if I call calc-group repeatedly in parallel, the concurrent executions of calc-number increase proportionately. E.g. if I call calc-group 5 times in parallel (tested from the CLI), I see 250 concurrent executions of calc-number, although 2500 requests are made and executed. (Beyond 10 parallel requests of calc-group, requests start getting rejected.) So it seems there is some other cap of 50, perhaps related to where the lambda calls originate. Is there any documentation on that, or a way to increase it?

Lambda #1, calc-group

Runtime: Node.js 12.x
Memory(MB): 2048
Timeout: 0 min 15 sec
Role that allows calling calc-number

const AWS = require('aws-sdk');
const lambda = new AWS.Lambda();

exports.handler = async (event) => {

    const n = 500;

    const promises = [];
    for (let x = 1; x <= n; ++x) {
        promises.push(
            lambda.invoke({
                FunctionName: "calc-number",
                Payload: JSON.stringify({x})
            }).promise()
        );
    }

    const data = await Promise.all(promises);

    const results = data.map(d => {
        const payload = JSON.parse(d["Payload"]);
        return payload["result"];
    });

    const sum = results.reduce((a, x) => a + x, 0);

    return { sum };
};

Lambda #2, calc-number

Runtime: Node.js 12.x
Memory(MB): 2048
Timeout: 0 min 3 sec
Reserved concurrency: 500

const util = require('util');
const sleep = util.promisify(setTimeout);

exports.handler = async (event) => {

    const x = event["x"] || 0;

    const result = x * x;

    await sleep(1000);
    
    return { result };
};

Michael - sqlbot · Accepted Answer

Your calling code is limited to 50 SDK requests in parallel because you don't appear to have changed the SDK's maxSockets setting from its default.

When using the default of https, the SDK takes the maxSockets value from the globalAgent. If the maxSockets value is not defined or is Infinity, the SDK assumes a maxSockets value of 50.

https://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/node-configuring-maxsockets.html
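To make the arithmetic concrete, here is a rough sketch of the rule quoted above and of how it would produce the observed 10s. The helpers effectiveMaxSockets and estimateSeconds are hypothetical names written for illustration; they mirror the documented behaviour and are not the SDK's actual code.

```javascript
const https = require('https');

// Hypothetical helper mirroring the documented rule: if maxSockets is
// undefined or Infinity, the SDK assumes 50. Not the SDK's own code.
function effectiveMaxSockets(agent) {
    const value = agent.maxSockets;
    return (value === undefined || value === Infinity) ? 50 : value;
}

// Rough estimate of wall-clock time when requests are served in waves of
// at most maxSockets at a time (ignores signing and network overhead).
function estimateSeconds(requests, maxSockets, secondsPerRequest) {
    return Math.ceil(requests / maxSockets) * secondsPerRequest;
}

// Node's global HTTPS agent, which the SDK consults by default.
const limit = effectiveMaxSockets(https.globalAgent);
console.log('effective maxSockets:', limit);
console.log('estimated seconds for 500 one-second calls:', estimateSeconds(500, limit, 1));
```

With a limit of 50, 500 one-second invocations run as 10 waves, which matches the timing in the question.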

Building and signing each API request takes CPU time, so raising the limit might not take you as far as you would like in a single Node process, but it will remove the barrier at 50.
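A minimal sketch of one way to raise the limit in calc-group, assuming the v2 aws-sdk used in the question. It passes a custom https.Agent through the client's httpOptions, as described on the linked page. The value 500 and the helper name createLambdaClient are illustrative choices, not something from the original code.

```javascript
const https = require('https');

// Agent allowing up to 500 sockets per host instead of the SDK's assumed 50.
// 500 is an illustrative value chosen to match the question's batch size.
const agent = new https.Agent({ maxSockets: 500 });

// Hypothetical helper: builds a Lambda client from the given aws-sdk (v2)
// module, wired to the agent above via httpOptions.
function createLambdaClient(AWS) {
    return new AWS.Lambda({ httpOptions: { agent } });
}

// In calc-group this would replace `new AWS.Lambda()`:
//   const AWS = require('aws-sdk');
//   const lambda = createLambdaClient(AWS);
```

Taking the SDK module as a parameter keeps the snippet loadable without aws-sdk installed; in the Lambda itself you could equally construct the client inline. The setting can also be applied to every service client at once with AWS.config.update({ httpOptions: { agent } }).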
