Golo Roden

Reputation: 150972

Writing to disk gets slower over time with Node.js

I am trying to write large files (500 MB each) to disk using Node.js. The first few files are written within a few seconds (typically 3 to 5), but starting at around the 10th file the writes get slower, and they do not recover.

The setup consists of a server that accepts files via a TCP/IP socket and pipes them to disk:

var fs = require('fs'),
    net = require('net'),
    path = require('path');

var counter = 0;

net.createServer(function (socket) {
  console.time('received');
  console.time('written');

  counter++;

  var filename = path.join(__dirname, 'temp' + counter + '.tmp');
  var file = fs.createWriteStream(filename, { encoding: 'utf8' });

  socket.pipe(file);

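  // The socket's 'end' event fires once all data has been received.
  socket.once('end', function () {
    console.timeEnd('received');
  });

  // The file's 'finish' event fires once all data has been flushed to the file system.
  file.once('finish', function () {
    console.timeEnd('written');
  });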
}).listen(3000);

I send the data from the terminal using nc in the following way:

$ while true; do cat input.tmp | nc localhost 3000; done

Running

$ time cat input.tmp > /dev/null

shows that cat always reads the file in the same amount of time. If I change the output path of the Node.js script to /dev/null, the writes also always take the same amount of time.

So the problem apparently is related to actually writing to disk.

I first thought that it might be a problem with concurrent reads and writes, but the problem persists even when I run

$ while true; do cat input.tmp | nc localhost 3000; sleep 5; done

If I run the same test with a file twice as large (1 GB), the slowdown sets in after roughly half the time.

UPDATE

I've changed my Node.js application to write everything to a single file, which is appended to over and over. The server now looks like this:

var fs = require('fs'),
    net = require('net'),
    path = require('path');

var filename = path.join(__dirname, 'temp.tmp');
var file = fs.createWriteStream(filename, { encoding: 'utf8' });

net.createServer(function (socket) {
  console.time('received');

  socket.pipe(file, { end: false });

  // 'end' fires once all data from this connection has been received.
  socket.once('end', function () {
    console.timeEnd('received');
  });
}).listen(3000);

Now the problem is gone, so apparently it has to do with writing multiple files in a row. As far as I can see, I am not writing multiple files at the same time (am I?), so I cannot think of a reason why this should happen. In particular, the sleep 5 should give the OS enough time to actually write everything to disk.
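One way to check whether several files really are being written at the same time is to count the write streams that have been opened but have not yet emitted 'finish'. The following is only a rough sketch based on the first server above; createInFlightTracker and startServer are hypothetical names that do not appear in the original code, and startServer is defined but deliberately not called here.

```javascript
var fs = require('fs'),
    net = require('net'),
    path = require('path');

// Hypothetical helper: counts write streams that were opened
// but have not yet emitted 'finish'.
function createInFlightTracker() {
  var current = 0,
      peak = 0;

  return {
    start: function () {
      current++;
      if (current > peak) { peak = current; }
      return current;
    },
    finish: function () {
      current--;
      return current;
    },
    peak: function () {
      return peak;
    }
  };
}

// Hypothetical wiring into the original server (not started here).
function startServer(port) {
  var tracker = createInFlightTracker(),
      counter = 0;

  return net.createServer(function (socket) {
    counter++;

    var filename = path.join(__dirname, 'temp' + counter + '.tmp');
    var file = fs.createWriteStream(filename);

    console.log('file opened, writes in flight:', tracker.start());
    socket.pipe(file);

    file.once('finish', function () {
      console.log('file finished, writes in flight:', tracker.finish());
    });
  }).listen(port);
}
```

If the logged count ever rises above 1, a new connection arrived before the previous file had finished, which would mean the writes do overlap.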

UPDATE 2

I originally tested using Node.js 0.10.32. When I switch to 0.11.13, the effect does not go away completely, but it takes much longer to appear. In the original setup the problem arose at around cycle 10; with Node.js 0.11.13 it shows up at cycle 30 at the earliest.

Any idea what might cause this behavior?

Upvotes: 3

Views: 2316

Answers (1)

xShirase

Reputation: 12389

I had a similar issue a while back. There is a limit on the number of concurrent I/O operations, so Node will start writing as many files at the same time as it can, and the rest are queued until a slot becomes free.

file 1 |-----------------------------------|
file 2  |-----------------------------------|
file 3   |-----------------------------------|
file 4                                      |-------------------------------------|

The above is just an example, but it shows the principle: in this case, writing 4 files takes twice as long as writing only 3 files.
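The arithmetic behind the diagram can be sketched as a small model. It assumes that every file takes the same time to write and that a fixed number of files can be written concurrently; totalWriteTime, the slot count of 3 and the 5 seconds per file are illustrative assumptions, not measured values or a Node.js API.

```javascript
// Hypothetical model: files are written in batches of at most `slots`,
// and each batch takes `secondsPerFile` seconds.
function totalWriteTime(fileCount, slots, secondsPerFile) {
  var batches = Math.ceil(fileCount / slots);
  return batches * secondsPerFile;
}

console.log(totalWriteTime(3, 3, 5)); // 5
console.log(totalWriteTime(4, 3, 5)); // 10
```

With 3 slots, the fourth file has to wait for a free slot, so the total time doubles even though only one more file is written.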

Upvotes: 3
