CR93

Reputation: 35

S3 Multipart upload in Chunks

I am trying to upload a file from a URL into my S3 bucket in chunks. My goal is to have python-logo.png in the example below stored on S3 as chunks named image.000, image.001, image.002, etc. I have the code below, but I am getting the error ValueError: Fileobj must implement read. Can someone point out what I am doing wrong? Please note that the actual data I am trying to upload is much larger; this image file is just an example.

import threading
import requests
import boto3
from boto3.s3.transfer import TransferConfig

bucket_name = "bucket-name-here"

def multi_part_upload_with_s3():
    # Multipart upload
    config = TransferConfig(multipart_threshold=100000, max_concurrency=80,
                            multipart_chunksize=100000, use_threads=True)
    url = "https://www.python.org/static/img/python-logo.png"
    r = requests.get(url, stream=True)
    session = boto3.Session()
    s3 = session.resource('s3')
    key = 'brokenup/image.'
    bucket = s3.Bucket(bucket_name)
    num = 0
    for chunk in r.iter_content(100000):
        file = key+str(num).zfill(3)
        num += 1
        bucket.upload_fileobj(chunk, file, Config=config, Callback=None)

Upvotes: 1

Views: 7438

Answers (1)

Anon Coward

Reputation: 10827

The documentation for upload_fileobj states:

Upload a file-like object to S3.

The file-like object must be in binary mode.

In other words, you need a binary file-like object, not a raw bytes object (which is what iter_content yields). The easiest way to get there is to wrap each chunk in a BytesIO object:

from io import BytesIO
...
        bucket.upload_fileobj(BytesIO(chunk), file, Config=config, Callback=None)
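To see why the wrapping is what fixes the error, here is a minimal sketch (no S3 or network needed, using a made-up bytes value in place of a real chunk): upload_fileobj only needs something with a read() method returning bytes, which raw bytes lack and BytesIO provides.

```python
from io import BytesIO

# A raw chunk of bytes, as requests' iter_content would yield it
# (hypothetical example data, not a real image chunk)
chunk = b"\x89PNG...example bytes..."

# Raw bytes have no read() method -- this is what triggers
# "ValueError: Fileobj must implement read" in upload_fileobj
print(hasattr(chunk, "read"))        # False

# Wrapping in BytesIO gives a binary file-like object with read()
fileobj = BytesIO(chunk)
print(hasattr(fileobj, "read"))      # True
print(fileobj.read() == chunk)       # True -- same bytes, now readable
```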

Upvotes: 2
