Rohan

Reputation: 611

IOError in Boto3 download_file

Background

I am using the following Boto3 code to download a file from S3.

import uuid

for record in event['Records']:
    bucket = record['s3']['bucket']['name']
    key = record['s3']['object']['key']
    print(key)
    if key.find('/') < 0:
        # File is uploaded outside any folder
        if len(key) > 4 and key[-5:].lower() == '.json':
            download_path = '/tmp/{}{}'.format(uuid.uuid4(), key)
    else:
        # File is uploaded inside a folder
        download_path = '/tmp/{}/{}'.format(uuid.uuid4(), key)

When a new file is uploaded to the S3 bucket, this code is triggered and downloads the newly uploaded file.

This works fine when the file is uploaded outside any folder.

However, when I upload a file inside a directory, an IOError occurs. Here is the error I am encountering:

[Errno 2] No such file or directory: /tmp/316bbe85-fa21-463b-b965-9c12b0327f5d/test1/customer1.json.586ea9b8: IOError

test1 is the directory inside my S3 bucket where customer1.json is uploaded.

Query

Any thoughts on how to resolve this error?

Upvotes: 15

Views: 15605

Answers (4)

Apoorv Maheshwari

Reputation: 11

The problem with your code is that download_path is wrong. Whenever you download a file that sits under a directory in your S3 bucket, the download path is built as:

download_path = /tmp/<uuid><object key name>
where <object key name> = "<directory name>/<object name>"

which expands to:

download_path = /tmp/<uuid><directory name>/<object name>

The code fails because no local directory named <uuid><directory name> exists. As written, your code can only download a file directly into the /tmp directory.

To fix the issue, consider splitting the key when building the download path. This also lets you drop the check for where the file was uploaded in the bucket, because only the object's file name ends up in the download path. For example:

for record in event['Records']:
    bucket = record['s3']['bucket']['name']
    key = record['s3']['object']['key']
    print (key) 
    download_path = '/tmp/{}{}'.format(uuid.uuid4(), key.split('/')[-1])
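The same idea as a small reusable function, as a minimal sketch. The name flat_download_path and the tmp_dir parameter are my own and not part of boto3. It uses posixpath.basename because S3 keys always use '/' as the separator, whatever the local OS.

```python
import os
import posixpath
import uuid


def flat_download_path(key, tmp_dir='/tmp'):
    """Return a path directly under tmp_dir for an S3 object key.

    Only the last component of the key is kept, so no sub-directory
    has to exist before the download starts.
    """
    # S3 keys use '/' as the separator regardless of the local OS.
    filename = posixpath.basename(key)
    # The uuid prefix keeps same-named files from different folders apart.
    return os.path.join(tmp_dir, '{}{}'.format(uuid.uuid4(), filename))
```

With this, both 'customer1.json' and 'test1/customer1.json' map to a file directly inside /tmp, so nothing has to be created beforehand.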

Upvotes: 0

Shailesh

Reputation: 2276

I faced the same issue, and the error message caused a lot of confusion because of the random string appended after the file name. In my case the cause was that the directory in the download path didn't exist.
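As far as I can tell, the random suffix appears because download_file first writes to a temporary file named after the target plus a random extension, and only renames it once the transfer has finished. The sketch below reproduces the failure without S3: opening such a temporary name fails while the parent directory is missing and succeeds once it exists. The suffix is simply copied from the traceback in the question, and the directory comes from tempfile so nothing under /tmp is touched.

```python
import os
import tempfile

base = tempfile.mkdtemp()
target = os.path.join(base, 'test1', 'customer1.json')

# Stand-in for the temporary name; the suffix is copied from the traceback.
temp_name = target + '.586ea9b8'

try:
    open(temp_name, 'wb').close()
    failed = False
except IOError as exc:  # on Python 3 this also catches FileNotFoundError
    failed = True
    print(exc)

# Once the parent directory exists, the same open() call succeeds.
os.makedirs(os.path.dirname(target))
open(temp_name, 'wb').close()
created = os.path.exists(temp_name)
```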

Upvotes: 3

Andriy Ivaneyko

Reputation: 22031

The error is raised because you attempted to download and save the file into a directory that does not exist. Create the directory before downloading the file. Because the key itself contains a folder (test1/), use os.makedirs, which also creates the intermediate directories; os.mkdir creates only a single level.

# ...
else:
    item_uuid = str(uuid.uuid4())
    download_path = '/tmp/{}/{}'.format(item_uuid, key)  # File is uploaded inside a folder
    os.makedirs(os.path.dirname(download_path))  # creates /tmp/<uuid>/<folder>

Note: It's better to use os.path.join() when working with file system paths, so the code above could be rewritten as:

# ...
else:
    item_uuid = str(uuid.uuid4())
    download_path = os.path.join('/tmp', item_uuid, key)
    os.makedirs(os.path.dirname(download_path))

The error may also be raised if you include '/tmp/' in the key used for the file in the S3 bucket. Do not include the tmp folder there, as it most likely does not exist on S3.
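For completeness, here is a rough sketch that wraps the directory creation in a helper. The name prepare_download_path and its tmp_dir parameter are hypothetical, and the demo at the bottom uses a throwaway directory from tempfile rather than the real /tmp.

```python
import os
import tempfile
import uuid


def prepare_download_path(key, tmp_dir='/tmp'):
    """Build a unique local path for key and create its parent directories."""
    # Split on '/' because that is the separator S3 keys use.
    download_path = os.path.join(tmp_dir, str(uuid.uuid4()), *key.split('/'))
    parent = os.path.dirname(download_path)
    if not os.path.isdir(parent):
        os.makedirs(parent)  # creates every missing level, unlike os.mkdir
    return download_path


# Demo with a throwaway root directory standing in for /tmp.
demo_root = tempfile.mkdtemp()
demo_path = prepare_download_path('test1/customer1.json', tmp_dir=demo_root)
```

Inside the handler, the returned path would then be passed as the third argument, e.g. s3_client.download_file(bucket, key, prepare_download_path(key)), assuming s3_client is a boto3 S3 client as in the question.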

Upvotes: 11

Rohan

Reputation: 611

Thanks for the help, Andriy Ivaneyko. I found a solution using boto3.

With the following code I am able to accomplish my task:

for record in event['Records']:
    bucket = record['s3']['bucket']['name']
    key = record['s3']['object']['key']
    fn = '/tmp/xyz'
    response = s3_client.get_object(Bucket=bucket, Key=key)
    contents = response['Body'].read()
    # Body.read() returns bytes, so open the file in binary mode
    with open(fn, 'wb') as fp:
        fp.write(contents)
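The same approach as a function that can be tried without AWS access, as a sketch only. save_object and FakeS3Client are hypothetical names: the fake client merely mimics the shape of a get_object response (a dict whose 'Body' is a readable stream), and in real code you would pass a boto3 S3 client instead.

```python
import io
import os
import tempfile


def save_object(s3_client, bucket, key, dest_path):
    """Fetch an object with get_object and write its bytes to dest_path."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    # 'Body' is a file-like stream; read() returns bytes, hence 'wb'.
    with open(dest_path, 'wb') as fp:
        fp.write(response['Body'].read())
    return dest_path


class FakeS3Client(object):
    """Hypothetical stand-in for a boto3 S3 client, used only in this demo."""

    def get_object(self, Bucket, Key):
        return {'Body': io.BytesIO(b'{"customer": 1}')}


demo_dest = os.path.join(tempfile.mkdtemp(), 'xyz')
save_object(FakeS3Client(), 'my-bucket', 'test1/customer1.json', demo_dest)
with open(demo_dest, 'rb') as fp:
    demo_contents = fp.read()
```

Since the destination is a fixed file name directly inside an existing directory, the folder part of the key never reaches the local file system, which is why this avoids the error.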

Upvotes: 1
