user1691915

Reputation: 63

Issues with s4cmd

I have about 50 GB of data to upload to an S3 bucket, but s3cmd is unreliable and very slow. The sync doesn't seem to work because of a timeout error.

I switched to s4cmd, and it works great: it is multi-threaded and fast.

     s4cmd dsync -r -t 1000 --ignore-empty-source forms/ s3://bucket/J/M/

The above uploads a set of files and then throws an error: [Thread Failure] Unable to read data from source: /home/ubuntu/path to file. The source is an image file, so there is nothing wrong with it.

s4cmd has options like --retry to restart the command if it fails, but this doesn't seem to work either. If you have come across a solution that prevents this error, please share.

Upvotes: 0

Views: 3638

Answers (1)

user1691915

Reputation: 63

I got it working fine. I'm glad my file uploads are super fast. If you are still using s3cmd I highly recommend you switch to s4cmd!

Download and install s4cmd. Find s4cmd.py and replace the read_file_chunk method with the following:

      @log_calls
      def read_file_chunk(self, source, pos, chunk):
        '''Read local file chunks'''
        data = None
        with open(source, 'rb') as f:
          f.seek(pos)
          data = f.read(chunk)
        # An empty read (e.g. a zero-byte file) is valid here;
        # fail only if no read happened at all.
        if data is None:
          raise Failure('Unable to read data from source: %s' % source)
        return StringIO(data)

Then run the patched s4cmd.py directly for the upload, like this:

     /pathtodir/s4cmd.py dsync -r forms/ s3://bucket/J/M/
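
For anyone wondering what the chunk read does with an empty file: in Python, reading a zero-byte file returns an empty bytes object, which is falsy, whereas an open file object is always truthy. Below is a minimal standalone sketch of the same chunked read. It is written for Python 3 (so io.BytesIO instead of StringIO) and is not taken from s4cmd itself; the function is a plain stand-in for the method, and the file names are made up for illustration.

```python
import io
import os
import tempfile


def read_file_chunk(source, pos, chunk):
    """Read up to `chunk` bytes from `source`, starting at byte offset `pos`."""
    with open(source, 'rb') as f:
        f.seek(pos)
        data = f.read(chunk)
    # f.read() returns b'' at end of file or for a zero-byte file.
    # That is a valid result, so it is returned as an empty buffer.
    return io.BytesIO(data)


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical sample files: one empty, one with ten bytes.
        empty = os.path.join(tmp, 'empty.jpg')
        full = os.path.join(tmp, 'full.jpg')
        open(empty, 'wb').close()
        with open(full, 'wb') as f:
            f.write(b'0123456789')

        print(read_file_chunk(full, 2, 4).getvalue())   # bytes 2..5 of the file
        print(read_file_chunk(empty, 0, 4).getvalue())  # empty bytes object
        print(bool(b''))                                # empty bytes are falsy
```

If some of the files in forms/ are zero bytes long, a check like "if not data" would reject them, so it may be worth looking for empty files in the source directory before blaming the network.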

Upvotes: 1
