demental

Reputation: 1484

Aws::S3 put_object very slow with aws-sdk-ruby

We generate PDF files in a background worker of a Rails app hosted on Heroku. Once generated, they are uploaded to Amazon S3. Both the Heroku app and the S3 bucket are located in the eu-west-1 region.

We are experiencing very slow uploads, even though the files are basic and small. Look at this example:

Aws.config.update({
  region: 'eu-west-1',
  credentials: Aws::Credentials.new(ENV['S3_USER_KEY'], ENV['S3_USER_SECRET'])
})

S3_BUCKET = Aws::S3::Resource.new.bucket(ENV['S3_PRIVATE_BUCKET'])

file = Tempfile.new(["testfile", ".pdf"], encoding: "ascii-8bit").tap do |file|
  file.write("a"*5000)
end

Benchmark.bm do |x|
  x.report { S3_BUCKET.put_object(key: "testfile.pdf", body: file) }
end

   user       system     total      real
   0.020000   0.040000   0.060000   ( 40.499553)

I don't think I can make a simpler example: uploading a file of 5000 characters to S3 from a Heroku one-off dyno takes 40 seconds.

Please note that I tested on both my home internet connection and a Heroku instance, and the results are nearly identical. On the other hand, I use ForkLift.app as a GUI to browse my buckets, and uploading a file there is almost instantaneous.

I've been browsing through the aws-sdk documentation and couldn't find anything that explains such a slow upload.

Upvotes: 2

Views: 2791

Answers (5)

coconuts

Reputation: 1

My solution:

Benchmark.bm do |x|
  file.seek(0) # add this
  x.report { S3_BUCKET.put_object(key: "testfile.pdf", body: file) }
end

# user       system     total      real
# 0.003022   0.000740   0.003762   (0.018425)
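
One plausible reason the seek helps, offered as a sketch and not as a confirmed diagnosis of the SDK: after `write`, the Tempfile's cursor sits at the end of the data, so anything that reads from the current position gets nothing until the file is rewound. The snippet below uses only the standard library and makes no S3 calls; the variable names are illustrative.

```ruby
require 'tempfile'

# Build the same 5000-byte file as in the question.
file = Tempfile.new(["testfile", ".pdf"], encoding: "ascii-8bit")
file.write("a" * 5000)
file.flush

# The cursor is left at the end of what was just written.
pos_after_write = file.pos

# Reading from the end of the file yields an empty string.
bytes_before_rewind = file.read.bytesize

# rewind has the same effect on the cursor as file.seek(0).
file.rewind
bytes_after_rewind = file.read.bytesize

puts "cursor after write:  #{pos_after_write}"
puts "bytes before rewind: #{bytes_before_rewind}"
puts "bytes after rewind:  #{bytes_after_rewind}"

file.close!
```

If the upload body is read from the current cursor, rewinding before `put_object` would hand over the whole file instead of an empty stream.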

Upvotes: 0

Nakachan

Reputation: 1

I think you are missing content_type in put_object.

S3_BUCKET.put_object(key: "testfile.pdf", body: file, content_type: "application/pdf")

I ran the same performance test with this change, and the upload time for a PDF file dropped from 20s to 2ms.

Hope this helps.

Upvotes: 0

Nicholas Mueller

Reputation: 43

I ran into a similar issue when creating an Aws::S3::Object and using its upload_file method. If I passed in a Tempfile object, uploading even a small file (~5KB) took ~40 seconds. However, passing in the Tempfile's path was dramatically faster (less than 1 second).

You may have needed to use the Aws::S3::Bucket method put_object for your own reasons, but put_object seems to accept only a String or IO as the body, not a file path. If you can refactor to create an Aws::S3::Object and use upload_file, you can use this workaround.

require 'aws-sdk-s3'
require 'benchmark'
require 'tempfile'

s3_resource = Aws::S3::Resource.new(region: 'us-east-2')

file = Tempfile.new(["testfile", ".pdf"], encoding: "ascii-8bit").tap do |file|
  file.write("a"*5000)
end

Benchmark.bm do |x|
  x.report {
    obj = s3_resource.bucket('mybucket').object("testfile-as-object.pdf")

    # Passing the Tempfile object is quite slow
    obj.upload_file(file)
  }
end

#       user     system      total        real
#   0.010359   0.006335   0.016694 ( 41.175544)

Benchmark.bm do |x|
  x.report {
    obj = s3_resource.bucket('mybucket').object("testfile-as-path.pdf")

    # Passing the Tempfile object's path is massively faster than passing the Tempfile object itself
    obj.upload_file(file.path)
  }
end

#       user     system      total        real
#   0.004573   0.002032   0.006605 (  0.398605)
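
A possible explanation for the gap between the two timings, stated as an assumption and not verified against the SDK source: a path is opened as a fresh handle whose cursor starts at byte 0, while the already-written Tempfile object still has its cursor at the end of the data. The sketch below shows that difference with the standard library only and makes no S3 calls; the variable names are illustrative.

```ruby
require 'tempfile'

file = Tempfile.new(["testfile", ".pdf"], encoding: "ascii-8bit")
file.write("a" * 5000)
file.flush # make sure the bytes are on disk before reopening by path

# Reading through the existing handle starts at its current cursor,
# which is the end of the file right after the write.
via_handle = file.read.bytesize

# Opening the path creates an independent handle positioned at byte 0.
via_path = File.open(file.path, "rb") { |f| f.read.bytesize }

puts "bytes read via existing handle: #{via_handle}"
puts "bytes read via path:            #{via_path}"

file.close!
```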

Upvotes: 0

izambl

Reputation: 659

It seems to be a problem with put_object and Tempfile.

Try reading the file into a String first:

new_file = IO.read(file)
S3_BUCKET.put_object(key: "testfile.pdf", body: new_file)

Upvotes: 1

demental

Reputation: 1484

It seems the AWS SDK is the culprit. I tested other ways of uploading the same file:

With the AWS CLI

(I was on a cellphone connection, so the network was really slow, and I did not take the time to install and configure the AWS CLI on a Heroku dyno.)

Benchmark.bm do |x|
  x.report { `aws s3 cp #{file.path} s3://#{ ENV['S3_BUCKET']}/testfile.pdf` }
end

0.000000   0.000000   0.510000 (  2.486112)

With Fog AWS

This was run from a Heroku Dyno.

connection = Fog::Storage.new({
  :provider                 => 'AWS',
  :aws_access_key_id        => ENV['S3_USER_KEY'],
  :aws_secret_access_key    => ENV['S3_USER_SECRET'],
  region: "eu-west-1"
})

directory = connection.directories.new(key: ENV["S3_BUCKET"], region: "eu-west-1")

Benchmark.bm do |x|
  x.report do
    directory.files.create(
      :key    => 'test-with-fog.pdf',
      :body   => file,
    )
  end
end

       user     system      total        real
   0.010000   0.010000   0.020000 (  0.050712)

I will stick with the latter as a workaround. Still, I have not found the reason for such slowness with aws-sdk.

Upvotes: 0
