Mark Stone

Reputation: 1

Why does my file get appended to rather than overwritten when using knox / node.js to grab a file from Amazon S3?

I'm experimenting with the knox module for node.js as a way of managing some small files in an Amazon S3 bucket. Everything works fine stand-alone: I can upload a file, download a file, etc. However, I want to be able to download a file on a recurring schedule. When I modify the code to run on an interval, each download is appended to the previous contents of the file instead of overwriting them.

I'm not sure whether I've made a mistake in the file write code or in the knox handling code. I've tried several different write approaches (writeFile, writeStream, etc.) and I've looked at the knox source code. Nothing obvious stands out to me as a problem. Here's the code I'm using:

knox = require('knox');
fs = require('fs');
var downFile = DOWNFILE;
var downTxt = '';
var timer = INTERVAL;
var path = S3PATH + downFile;
setInterval(function() 
{
        var s3client = knox.createClient(
        {
                key: '********************',
                secret: '**********************************',
                bucket: '********'
        });
        s3client.get(path).on('response', function(response)
        {
                response.setEncoding('ascii');
                response.on('data', function(chunk)
                {
                        downTxt += chunk;
                });
                response.on('end', function()
                {
                        fs.writeFileSync(downFile, downTxt, 'ascii');
                });
        }).end();
},
timer);

Upvotes: 0

Views: 695

Answers (1)

loganfsmyth

Reputation: 161457

The problem is the placement of var downTxt = '';. That line runs only once, when the script starts, so downTxt is never reset between requests. On every interval the new chunks are added to everything collected by the previous requests, and writeFileSync then writes the whole accumulated string. The file itself is being overwritten each time, just with ever-growing content. The simplest fix is to move that line inside the 'response' callback, just before the setEncoding line.
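
To see the effect in isolation, here is a minimal sketch with no knox involved. fakeDownload is a made-up stand-in for one S3 request that delivers its body in two chunks, and the other names are equally hypothetical; the only point is where the buffer is declared.

```javascript
// Hypothetical stand-in for one download: delivers a body in two chunks.
function fakeDownload(onChunk) {
  onChunk('hello ');
  onChunk('world');
}

// Buggy: the buffer is declared once, outside the repeated work,
// so every run adds to what earlier runs collected.
var sharedTxt = '';
function downloadShared() {
  fakeDownload(function (chunk) { sharedTxt += chunk; });
  return sharedTxt;
}

// Fixed: each run starts with a fresh, empty buffer.
function downloadFresh() {
  var txt = '';
  fakeDownload(function (chunk) { txt += chunk; });
  return txt;
}

downloadShared();
console.log(downloadShared()); // hello worldhello world
downloadFresh();
console.log(downloadFresh());  // hello world
```

The second call to downloadShared returns the body twice, which is exactly what ends up in your file after the second interval.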

However, the way you are processing the data is unnecessarily complicated. Try something like this instead. You don't need to recreate the client every time. Setting the encoding will corrupt the data if you are downloading non-text files, and it gains you nothing for plain ASCII text. You also shouldn't collect the data manually; you can start writing it to the file as soon as you receive it. Since the response is a standard stream, you don't need to monitor the 'data' event at all because you can just use pipe. Finally, scheduling the next run with setTimeout from the write stream's 'close' handler means a new download never starts before the previous one has finished writing.

var knox = require('knox'),
    fs = require('fs'),
    downFile = DOWNFILE,
    timer = INTERVAL,
    path = S3PATH + downFile,
    s3client = knox.createClient({
        key: '********************',
        secret: '**********************************',
        bucket: '********'
    });

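(function downloadFile() {
  // Opening the write stream truncates the file, so each run overwrites it.
  var str = fs.createWriteStream(downFile);
  // knox's get() returns an unsent HTTP request; the body arrives on the
  // 'response' event, and end() is what actually sends the request.
  s3client.get(path).on('response', function(res) {
    res.pipe(str);
  }).end();
  // Schedule the next download only after the file is fully written.
  str.on('close', function() {
    setTimeout(downloadFile, timer);
  });
})();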
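
If you want to convince yourself that the stream approach overwrites rather than appends, here is a small sketch that uses only local files, assuming a local read stream behaves like the S3 response for this purpose. The temp file names and the copyOnce helper are made up for the demo and are not part of knox.

```javascript
var fs = require('fs');
var os = require('os');
var pathModule = require('path');

// Hypothetical temp files standing in for the S3 object and the local copy.
var src = pathModule.join(os.tmpdir(), 'pipe-demo-src-' + process.pid + '.txt');
var dest = pathModule.join(os.tmpdir(), 'pipe-demo-dest-' + process.pid + '.txt');

function copyOnce(from, to, done) {
  // createWriteStream defaults to flags 'w', which truncates the target,
  // so repeated runs replace the contents instead of appending.
  var out = fs.createWriteStream(to);
  fs.createReadStream(from).pipe(out);
  // 'close' fires once the data is flushed and the file is closed.
  out.on('close', done);
}

fs.writeFileSync(src, 'payload');
copyOnce(src, dest, function () {
  copyOnce(src, dest, function () {
    console.log(fs.readFileSync(dest, 'utf8')); // payload
  });
});
```

After two copies the file still holds a single payload. You would only get appending if you opened the write stream with flags: 'a'.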

Upvotes: 1
