Reputation: 503

S3 file upload stream using node js

I am trying to find a way to stream a file to Amazon S3 through a Node.js server, with these requirements:

Don't store a temporary file on the server or hold the complete file in memory. Buffering up to some limit is acceptable for uploading.
No restriction on the uploaded file size.
Don't block the server until the upload completes, because a large upload would otherwise unexpectedly increase the waiting time of other requests.

I don't want to upload directly from the browser because the S3 credentials would have to be shared in that case. Another reason to upload through the Node.js server is that some authentication may need to be applied before the file is uploaded.

I tried to achieve this using node-multiparty, but it did not work as expected. You can see my solution and the issue at https://github.com/andrewrk/node-multiparty/issues/49. It works fine for small files but fails for a 15 MB file.

Is there a solution or an alternative?

Upvotes: 42

Answers (6)

nfroidure

Reputation: 1601

For your information, the v3 SDK was published with a dedicated module to handle this use case: https://www.npmjs.com/package/@aws-sdk/lib-storage

Took me a while to find it.
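Since the answer only links the package, here is a minimal sketch of how a streaming upload with it might look. The helper names (buildUploadOptions, uploadStream) and the tuning values are my own assumptions, not something from the answer; Upload, its options, the httpUploadProgress event and done() come from @aws-sdk/lib-storage.

```javascript
// Builds the options object passed to lib-storage's Upload.
// The concrete values below are illustrative assumptions, not recommendations.
function buildUploadOptions(client, bucket, key, body) {
  return {
    client,
    params: { Bucket: bucket, Key: key, Body: body }, // Body may be a readable stream
    queueSize: 4,              // number of parts uploaded concurrently
    partSize: 5 * 1024 * 1024, // bytes buffered per part (5 MiB)
    leavePartsOnError: false,  // clean up already-uploaded parts on failure
  };
}

// Hypothetical helper: uploads any readable stream without knowing its length.
async function uploadStream(bucket, key, body) {
  // Required lazily so this file still loads where the SDK is not installed.
  const { S3Client } = require('@aws-sdk/client-s3');
  const { Upload } = require('@aws-sdk/lib-storage');

  const client = new S3Client({}); // region and credentials come from the environment
  const upload = new Upload(buildUploadOptions(client, bucket, key, body));
  upload.on('httpUploadProgress', (progress) => console.log(progress.loaded));
  return upload.done();
}
```

With these settings, memory use should stay roughly bounded by queueSize × partSize, so something like uploadStream('myBucket', 'myKey', req) would not need the whole file in memory.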

Upvotes: 8

Johann Philipp Strathausen

Reputation: 5736

You can now stream with the official Amazon SDK for Node.js: see the section "Uploading a File to an Amazon S3 Bucket" or their example on GitHub.

What's even more awesome, you can finally do so without knowing the file size in advance. Simply pass the stream as the Body:

var AWS = require('aws-sdk');
var fs = require('fs');
var zlib = require('zlib');

// Gzip the file on the fly; the compressed size is not known in advance.
var body = fs.createReadStream('bigfile').pipe(zlib.createGzip());
var s3obj = new AWS.S3({params: {Bucket: 'myBucket', Key: 'myKey'}});
s3obj.upload({Body: body})
  .on('httpUploadProgress', function(evt) { console.log(evt); })
  .send(function(err, data) { console.log(err, data) });

Upvotes: 46

mattdlockyer

Reputation: 7314

If it helps anyone I was able to stream from the client to s3 successfully (without memory or disk storage):

https://gist.github.com/mattlockyer/532291b6194f6d9ca40cb82564db9d2a

The server endpoint assumes req is a stream object. I sent a File object from the client, which modern browsers can send as binary data, and put the file info in the headers.

const fileUploadStream = (req, res) => {
  //get "body" args from header
  const { id, fn } = JSON.parse(req.get('body'));
  const Key = id + '/' + fn; //upload to s3 folder "id" with filename === fn
  const params = {
    Key,
    Bucket: bucketName, //set somewhere
    Body: req, //req is a stream
  };
  s3.upload(params, (err, data) => {
    if (err) {
      res.send('Error Uploading Data: ' + JSON.stringify(err) + '\n' + JSON.stringify(err.stack));
    } else {
      res.send(Key);
    }
  });
};

Yes, putting the file info in the headers breaks convention, but if you look at the gist it's much cleaner than anything else I found using streaming libraries, multer, busboy, etc.
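For reference, the matching client call might look roughly like the sketch below. The helper name buildUploadRequest and the '/upload' path are assumptions for illustration; the header name body and the fields id and fn mirror what the server code reads with req.get('body').

```javascript
// Hypothetical client-side helper: builds the options for fetch().
// `file` is a File object, e.g. taken from an <input type="file"> element.
function buildUploadRequest(id, file) {
  return {
    method: 'POST',
    headers: {
      'Content-Type': 'application/octet-stream',
      // File info travels in a header named "body", as the server expects.
      body: JSON.stringify({ id: id, fn: file.name }),
    },
    body: file, // the browser sends the File as raw binary data
  };
}

// Usage in a browser ('/upload' is an assumed route):
// fetch('/upload', buildUploadRequest('user-42', fileInput.files[0]));
```

One caveat with this approach: header values are limited to Latin-1 characters, so a file name containing other characters would need to be encoded on the client and decoded on the server.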

+1 for pragmatism and thanks to @SalehenRahman for his help.

Upvotes: 1

Harshavardhana

Reputation: 1428

Alternatively, you can look at https://github.com/minio/minio-js. It has a minimal set of abstracted APIs implementing the most commonly used S3 calls.

Here is an example of a streaming upload.

$ npm install minio
$ cat >> put-object.js << EOF

var Minio = require('minio')
var fs = require('fs')

// find out your s3 end point here:
// http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region

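// Constructor form from the minio release this answer was written against;
// newer releases expose the client as Minio.Client with different option names.
var s3Client = new Minio({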
  url: 'https://<your-s3-endpoint>',
  accessKey: 'YOUR-ACCESSKEYID',
  secretKey: 'YOUR-SECRETACCESSKEY'
})

var file = 'your_localfile.zip'
// Read the local file as a stream so it is never loaded into memory in full.
var fileStream = fs.createReadStream(file)
fs.stat(file, function(e, stat) {
  if (e) {
    return console.log(e)
  }
  s3Client.putObject('mybucket', 'hello/remote_file.zip', 'application/octet-stream', stat.size, fileStream, function(e) {
    return console.log(e) // should be null
  })
})
EOF

putObject() here is a fully managed single function call: for files over 5 MB it automatically performs a multipart upload internally. You can also resume a failed upload, and it will continue from where it left off by verifying the previously uploaded parts.

Additionally, this library is isomorphic and can be used in browsers as well.
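To illustrate what "multipart internally" means for memory use, here is a minimal sketch of regrouping a stream into fixed-size parts using only Node's stream module. This is not minio's actual implementation; PartChunker and partSizes are hypothetical names. S3 itself requires every part except the last to be at least 5 MiB, which is why libraries buffer up to one part at a time.

```javascript
const { Transform, Readable } = require('node:stream');

// Hypothetical helper: regroups incoming chunks into parts of exactly
// `partSize` bytes (the final part may be smaller).
class PartChunker extends Transform {
  constructor(partSize) {
    // Object mode on the readable side keeps each part a separate item.
    super({ readableObjectMode: true });
    this.partSize = partSize;
    this.pending = [];     // chunks received but not yet emitted
    this.pendingBytes = 0; // total size of `pending`
  }

  _transform(chunk, _encoding, callback) {
    this.pending.push(chunk);
    this.pendingBytes += chunk.length;
    if (this.pendingBytes >= this.partSize) {
      const buf = Buffer.concat(this.pending);
      let offset = 0;
      // Emit as many full parts as the buffered data allows.
      while (buf.length - offset >= this.partSize) {
        this.push(buf.subarray(offset, offset + this.partSize));
        offset += this.partSize;
      }
      const rest = buf.subarray(offset);
      this.pending = rest.length > 0 ? [rest] : [];
      this.pendingBytes = rest.length;
    }
    callback();
  }

  _flush(callback) {
    // Emit whatever is left as the final, possibly short, part.
    if (this.pendingBytes > 0) this.push(Buffer.concat(this.pending));
    callback();
  }
}

// Demo helper: returns the size of each part produced from `chunks`.
async function partSizes(chunks, partSize) {
  const sizes = [];
  for await (const part of Readable.from(chunks).pipe(new PartChunker(partSize))) {
    sizes.push(part.length);
  }
  return sizes;
}
```

In a real upload each emitted part would be sent as one multipart request, so only about one part's worth of data (plus whatever the stream itself buffers) sits in memory at any time.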

Upvotes: 0

Daveee

Reputation: 49

I'm using the s3-upload-stream module in a working project here.

There are also some good examples from @raynos in his http-framework repository.

Upvotes: 0

Yaroslav Pogrebnyak

Reputation: 1237

Give https://www.npmjs.org/package/streaming-s3 a try.

I used it to upload several big files (>500 MB) in parallel, and it worked very well. It is very configurable and also lets you track upload statistics. You do not need to know the total size of the object, and nothing is written to disk.

Upvotes: 2
