DineshKumar

Reputation: 1739

concurrent.futures.ThreadPoolExecutor always throws a timeout exception in AWS Lambda

I have the following code in AWS Lambda, which polls an API until the status is COMPLETED. I used ThreadPoolExecutor from concurrent.futures.

Here is the sample code.

import concurrent.futures
import json

import requests

def copy_url(headers, data):
    # Poll the API until the collection status is COMPLETED or the retry limit is hit.
    # URL, like headers and data, is defined elsewhere in the handler.
    collectionStatus = 'INITIATED'
    retries = 0
    print("The data to be copied is", data)
    while collectionStatus != 'COMPLETED' and retries <= 50:
        r = requests.post(
            url=URL,
            headers=headers,
            data=json.dumps(data))
        # Re-read the status from the response so the loop can exit
        collectionStatus = r.json().get('status').pop().get('status')
        retries += 1
        print("The collection status is", collectionStatus)
    return collectionStatus


with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    future = executor.submit(copy_url, headers, data)
    return_value = future.result()

I had already implemented this using regular Python threads, but since I wanted a return value from the thread, I tried the ThreadPoolExecutor approach instead. Although this works perfectly in PyCharm, it always throws a timeout error in AWS Lambda.

Could someone please explain why this happens only in AWS Lambda?

Note: I have already tried increasing the Lambda timeout value. This happens only when the ThreadPoolExecutor code is in place; when I comment it out, everything works fine. It also works fine with the regular Python thread implementation.

Upvotes: 3

Views: 2555

Answers (2)

balq

Reputation: 21

The question is about multithreaded execution, but the AWS documentation quoted in the other answer is about multiprocessing; they are different implementations.

  • Multiprocessing opens a new child process to execute the operation.
  • Multithreading creates a new thread in the same process to execute the operation.

More information in this answer: Multiprocessing vs Threading Python
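
You can see the difference locally with a minimal sketch (the work done by the workers is just a placeholder): the ThreadPoolExecutor workers all report the parent's process ID, while the multiprocessing.Pool workers report different ones, and that Pool is exactly the part that breaks in the Lambda environment.

import concurrent.futures
import multiprocessing
import os

def report(_):
    # Each worker reports the process it is running in
    return os.getpid()

if __name__ == '__main__':
    with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
        # Threads share one process, so every result is the same PID
        print("threads:", set(executor.map(report, range(3))))

    with multiprocessing.Pool(processes=3) as pool:
        # Child processes have their own PIDs; this is what fails in Lambda
        print("processes:", set(pool.map(report, range(3))))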

Upvotes: 0

DineshKumar

Reputation: 1739

Finally, I changed the implementation to listen for an SQS trigger rather than waiting for the response from the API (the API is handled by a different component and the response takes a significant amount of time).
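
The new setup is roughly like this (a minimal sketch; the message fields are placeholders, not the exact ones we use):

import json

def lambda_handler(event, context):
    # The SQS trigger delivers one or more messages in event['Records']
    for record in event.get('Records', []):
        message = json.loads(record['body'])
        # React to the completion notification instead of polling the API
        print("Collection update received:", message)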

It looks like we should avoid parallel processing with Python in AWS Lambda.

From the AWS docs:

The multiprocessing module that comes with Python lets you run multiple processes in parallel. Due to the Lambda execution environment not having /dev/shm (shared memory for processes) support, you can’t use multiprocessing.Queue or multiprocessing.Pool.

If multiprocessing has to be used, only Pipe is supported.
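
For anyone who does need multiple processes in Lambda, the pattern that works is Process plus Pipe instead of Pool/Queue. A minimal sketch (the work done in the child is just a placeholder):

from multiprocessing import Pipe, Process

def worker(conn, item):
    # Placeholder work; the result goes back over the pipe instead of a Queue
    conn.send(item * 2)
    conn.close()

def lambda_handler(event, context):
    items = [1, 2, 3]
    processes = []
    parent_connections = []
    for item in items:
        parent_conn, child_conn = Pipe()
        parent_connections.append(parent_conn)
        processes.append(Process(target=worker, args=(child_conn, item)))
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    # Collect the results from the parent ends of the pipes
    return [conn.recv() for conn in parent_connections]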

Upvotes: 2
