Andrey Novosad

Reputation: 461

Is the boto3 client thread-safe?

Is the boto3 low-level client for S3 thread-safe? The documentation is not explicit about it.

https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#client

A similar issue is discussed on GitHub:

https://github.com/boto/botocore/issues/1246

But there is still no answer from the maintainers.

Upvotes: 46

Views: 49965

Answers (5)

sam2426679

Reputation: 3857

This was answered by the boto team on May 19, 2021. See source docs here.

Resource instances are not thread safe and should not be shared across threads or processes. These special classes contain additional meta data that cannot be shared. It's recommended to create a new Resource for each thread or process:

import boto3
import boto3.session
import threading

class MyTask(threading.Thread):
    def run(self):
        # Here we create a new session per thread
        session = boto3.session.Session()

        # Next, we create a resource client using our thread's session object
        s3 = session.resource('s3')

        # Put your thread-safe code here

In the example above, each thread would have its own Boto3 session and its own instance of the S3 resource. This is a good idea because resources contain shared data when loaded and calling actions, accessing properties, or manually loading or reloading the resource can modify this data.

Upvotes: 5

Skam

Reputation: 7808

If you take a look at the Multithreading/Processing documentation for boto3, you can see that they recommend one client per session, as there is shared data between instances that can be mutated by individual threads.

It also looks like there's an open GitHub issue for this exact question: https://github.com/boto/botocore/issues/1246

Upvotes: 34

sakell

Reputation: 141

You can run multiple threads successfully, but you have to instantiate a new session per thread/process. With that in place you can, for example, download from an S3 bucket concurrently.

An example is below:

import concurrent.futures
import boto3
import json


files = ["path-to-file.json", "path-to-file2.json"] 

def download_from_s3(file_path):
    # setup a new session
    sess = boto3.session.Session()
    client = sess.client("s3")
    # download a file
    obj = client.get_object(Bucket="<your-bucket>", Key=file_path)
    resp = json.loads(obj["Body"].read())
    return resp

with concurrent.futures.ThreadPoolExecutor() as executor:
    # Consume the iterator so results are kept and worker exceptions surface here
    results = list(executor.map(download_from_s3, files))
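
The function above builds a new session on every call. Since pool threads are reused across tasks, one possible variation is to cache a client per thread with threading.local(). This is only a sketch and is not taken from the boto3 docs: the names _local, _new_s3_client and get_s3_client are hypothetical, and the factory parameter is an assumption added so the caching can be tried without AWS credentials.

```python
import threading

# Per-thread storage: an attribute set here is visible only to the thread that set it.
_local = threading.local()


def _new_s3_client():
    # Imported lazily so the caching logic can be exercised without boto3 installed.
    import boto3

    # A fresh session for the calling thread, as in the answer above.
    return boto3.session.Session().client("s3")


def get_s3_client(factory=_new_s3_client):
    """Return the calling thread's client, creating it on first use."""
    client = getattr(_local, "s3_client", None)
    if client is None:
        client = factory()
        _local.s3_client = client
    return client
```

Inside download_from_s3, the two setup lines could then be replaced by client = get_s3_client(), so each worker thread creates its session once instead of once per file.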

Upvotes: 3

AlexB

Reputation: 361

From the documentation:

Low-level clients are thread safe. When using a low-level client, it is recommended to instantiate your client then pass that client object to each of your threads.

Instantiating a client is not thread-safe, although the resulting instance is. To make this work in a multi-threaded environment, guard instantiation with a global lock like this:

import threading

import boto3

# One process-wide lock that serializes client creation
boto3_client_lock = threading.Lock()

def create_client():
    with boto3_client_lock:
        return boto3.client('s3', aws_access_key_id='your key id', aws_secret_access_key='your access key')
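
Following the quoted recommendation, the client is created once and the same object is handed to every worker thread. A minimal sketch of that pattern, assuming the workers only call get_object; the helper name fetch_all and the default max_workers=4 are hypothetical and not part of boto3.

```python
import concurrent.futures


def fetch_all(client, bucket, keys, max_workers=4):
    """Download every key using one shared client; bodies come back in key order."""

    def fetch(key):
        # Only the already-built client object is shared between threads.
        response = client.get_object(Bucket=bucket, Key=key)
        return response["Body"].read()

    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order and re-raises a worker's exception on iteration.
        return list(pool.map(fetch, keys))
```

With create_client() from the snippet above, a call would look like fetch_all(create_client(), "<your-bucket>", ["path-to-file.json"]).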

Upvotes: 26

Pawel

Reputation: 219

I recently tried using a single boto client instance with concurrent.futures.ThreadPoolExecutor and ran into exceptions coming from boto. I assume the boto client is not thread-safe in this case.

The exception I got:

  File "xxx/python3.7/site-packages/boto3/session.py", line 263, in client
    aws_session_token=aws_session_token, config=config)
  File "xxx/python3.7/site-packages/botocore/session.py", line 827, in create_client
    endpoint_resolver = self._get_internal_component('endpoint_resolver')
  File "xxx/python3.7/site-packages/botocore/session.py", line 694, in _get_internal_component
    return self._internal_components.get_component(name)
  File "xxx/python3.7/site-packages/botocore/session.py", line 906, in get_component
    del self._deferred[name]

Upvotes: 16
