Reputation: 35
I am trying to upload a file from a URL into S3 in chunks. My goal is to have python-logo.png in the example below stored on S3 as chunks image.000, image.001, image.002, etc. I have the code below, but I am getting the error ValueError: Fileobj must implement read.
Can someone point out what I am doing wrong? Please note the actual data I am trying to upload is much larger; this image file is just an example.
import threading
import requests
import boto3
from boto3.s3.transfer import TransferConfig

bucket_name = "bucket-name-here"

def multi_part_upload_with_s3():
    # Multipart upload
    config = TransferConfig(multipart_threshold=100000, max_concurrency=80,
                            multipart_chunksize=100000, use_threads=True)
    url = "https://www.python.org/static/img/python-logo.png"
    r = requests.get(url, stream=True)
    session = boto3.Session()
    s3 = session.resource('s3')
    key = 'brokenup/image.'
    bucket = s3.Bucket(bucket_name)
    num = 0
    for chunk in r.iter_content(100000):
        file = key + str(num).zfill(3)
        num += 1
        bucket.upload_fileobj(chunk, file, Config=config, Callback=None)
Upvotes: 1
Views: 7438
Reputation: 10827
The documentation for upload_fileobj states:
Upload a file-like object to S3.
The file-like object must be in binary mode.
In other words, you need a binary file-like object, not a bytes object. The easiest way to get there is to wrap each chunk in a BytesIO object:
from io import BytesIO
...
bucket.upload_fileobj(BytesIO(chunk), file, Config=config, Callback=None)
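Putting it together, the loop from the question with that one change applied might look like the following. This is a minimal sketch: bucket-name-here remains a placeholder, and each chunk still becomes its own image.NNN object, exactly as in the original code.

from io import BytesIO

import boto3
import requests
from boto3.s3.transfer import TransferConfig

bucket_name = "bucket-name-here"  # placeholder, as in the question

def multi_part_upload_with_s3():
    config = TransferConfig(multipart_threshold=100000, max_concurrency=80,
                            multipart_chunksize=100000, use_threads=True)
    url = "https://www.python.org/static/img/python-logo.png"
    r = requests.get(url, stream=True)
    session = boto3.Session()
    s3 = session.resource('s3')
    key = 'brokenup/image.'
    bucket = s3.Bucket(bucket_name)
    num = 0
    for chunk in r.iter_content(100000):
        file = key + str(num).zfill(3)
        num += 1
        # Wrap the raw bytes in BytesIO so the object exposes read()
        bucket.upload_fileobj(BytesIO(chunk), file, Config=config, Callback=None)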
Upvotes: 2