Reputation: 7023
I have a Python AWS Lambda function that takes an image (geospatial raster) on S3, computes a subset and dumps it to another bucket on S3. The function only runs on demand; there is no schedule.
So basically it has 3 arguments: source, destination and window.
I was trying to invoke the function from my local PC in a loop, going over 1000+ sources:
    import json
    import botocore.session

    # ...
    # create lambda client from botocore session
    session = botocore.session.get_session()
    lambda_client = session.create_client("lambda")

    for file in input_files:
        # create payload body with source, destination, window
        # ...
        response = lambda_client.invoke(
            FunctionName="foobar",
            InvocationType="Event",
            Payload=json.dumps(payload),
        )
        assert response["ResponseMetadata"]["HTTPStatusCode"] == 202
The function is set to have a maximum memory of 1024 MB and a timeout of 15 s, which seems to be just fine.
When I run the invocations, they take quite some time (which is fine), but for some reason I don't get many concurrent invocations. I haven't set any limits, nor do I see any reason why it would get throttled.
I can see in the metrics dashboard that I don't get more than 8 concurrent executions:
A couple of Qs:
How can I run this function with a higher concurrency?
Is there a better way to implement this kind of function?
Upvotes: 4
Views: 826
Reputation: 238051
Based on the comments.
The Duration metric shows that the average execution time of the lambda function is about 0.5 seconds. Since concurrency is about 8, this means that the for loop in the question makes about 8 requests within this time period.
Since the execution time is so short, a possible solution to improve the time efficiency is to batch the requests, so that multiple payloads are sent to the function in one API call. This not only reduces the number of API calls to AWS, but also extends the execution time of each invocation, amortizing the per-invocation overhead.
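A minimal sketch of the batching idea, assuming the Lambda handler is changed to loop over event["items"]; the batch size of 50 and the helper names (chunked, invoke_in_batches) are illustrative, not from the original post:

    import json

    def chunked(items, size):
        """Split a list into consecutive batches of at most `size` items."""
        return [items[i:i + size] for i in range(0, len(items), size)]

    def invoke_in_batches(lambda_client, payloads, batch_size=50):
        """Send payloads in batches: one Event invocation per batch.

        Each API call now carries up to `batch_size` work items, so for
        1000+ sources this cuts the number of invoke calls by ~50x.
        """
        for batch in chunked(payloads, batch_size):
            response = lambda_client.invoke(
                FunctionName="foobar",
                InvocationType="Event",
                # the handler is assumed to iterate over event["items"]
                Payload=json.dumps({"items": batch}),
            )
            assert response["ResponseMetadata"]["HTTPStatusCode"] == 202

Note that a batch of 50 items at ~0.5 s each still fits well within the 15 s timeout, but the batch size should be chosen with that limit in mind.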
The alternative is to perform the invoke API calls in parallel, rather than one-by-one as is currently done in the for loop.
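A hedged sketch of the parallel approach using a thread pool (boto3/botocore clients are safe to share across threads for calls like this); the worker count of 32 and the function names are illustrative choices:

    import json
    from concurrent.futures import ThreadPoolExecutor, as_completed

    def invoke_one(lambda_client, payload):
        """Fire a single asynchronous (Event) invocation."""
        response = lambda_client.invoke(
            FunctionName="foobar",
            InvocationType="Event",
            Payload=json.dumps(payload),
        )
        return response["ResponseMetadata"]["HTTPStatusCode"]

    def invoke_all(lambda_client, payloads, max_workers=32):
        """Issue invoke calls from a thread pool instead of one-by-one.

        With 32 in-flight requests the loop is no longer limited by the
        round-trip latency of a single invoke call.
        """
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            futures = [pool.submit(invoke_one, lambda_client, p) for p in payloads]
            return [f.result() for f in as_completed(futures)]

Since each invoke call spends most of its time waiting on the network, threads (rather than processes) are sufficient here.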
Upvotes: 2