zwol

Reputation: 140876

Buffered writes to stdout in node.js

In Node.js the process.stdout stream is documented to behave "synchronously" (at least when attached to a file or a terminal), which, among other things, means that every call to stdout.write causes an immediate write system call -- there's no buffering. For instance

import { stdout } from 'process';

for (let i = 1; i <= 1000; i++) {
    stdout.write(`line ${i}\n`);
}
stdout.end();

makes 1000 write system calls. This is Not What You Want when you are writing a traditional Unix data-emitting utility. It's possible to bypass process.stdout and create a separate Writable stream that sinks to file descriptor 1, e.g.

import { stdout } from 'process';
import { createWriteStream } from 'fs';

// The path argument is ignored when an fd is supplied in the options.
let ostream = createWriteStream("/ignored", { fd: stdout.fd });
for (let i = 1; i <= 1000; i++) {
    ostream.write(`line ${i}\n`);
}
ostream.end();

makes just one system call. However, bypasses like this are dangerous -- after the ostream.end call, file descriptor 1 is closed but the process.stdout object does not know that.
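To see the hazard in action, here is a minimal sketch (using the ostream from above; the exact failure mode depends on the platform and on whether fd 1 gets reused in the meantime):

ostream.once('close', () => {
    // fd 1 has now been closed behind process.stdout's back.
    // Expect an EBADF error here -- or silently misdirected output,
    // if some intervening open() has reused descriptor 1.
    process.stdout.write('where does this go?\n', (err) => {
        if (err) process.stderr.write(`stdout is broken: ${err.code}\n`);
    });
});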

Is there an official way to get buffered output when using process.stdout? Ideal in my book would be something like C's setvbuf.

Upvotes: 5

Views: 1842

Answers (3)

zwol

Reputation: 140876

After giving this a bunch more thought, I believe the ideal way to use process.stdout safely and efficiently for bulk data output would be to dup the OS-level STDOUT_FILENO, wrap a normal fs.WriteStream around the duplicate, and then dup2 STDERR_FILENO over STDOUT_FILENO. That last step would make all of the global console object's methods, and process.stdout itself, write to stderr instead of stdout. (As far as I can tell there is no other way to accomplish that.) The code generating the bulk data would then write to the normal WriteStream instead of to process.stdout.

Or, in code:

import * as fs from "fs/promises";
import process from "process";

async function prepare_buffered_stdout(options) {
    options ??= {};
    // NB: fs.dup, fs.dup2, and fs.fdopen are hypothetical -- see below.
    // Keep a private duplicate of the real stdout descriptor.
    const stdout_copy = await fs.dup(process.stdout.fd);
    // Re-point fd 1 at stderr, so console.log() et al. cannot corrupt
    // the bulk output stream.
    await fs.dup2(process.stderr.fd, process.stdout.fd);
    // Wrap a buffered WriteStream around the duplicate.
    return fs.fdopen(stdout_copy, "w").createWriteStream(options);
}

There is, unfortunately, the minor problem that the functions fs.dup, fs.dup2, and fs.fdopen do not exist. I have filed a feature request for them to be added.
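In the meantime, a partial workaround using only APIs that exist today is to pass autoClose: false when wrapping fd 1, so that ending the stream flushes its buffer without closing the descriptor out from under process.stdout. (This does not re-point console.log at stderr, so it is only half of the scheme above.)

import { stdout } from "process";
import { createWriteStream } from "fs";

// autoClose: false means end() flushes this stream's buffer but does
// NOT close fd 1, so process.stdout remains usable afterwards.
// (The path argument is ignored when an fd is supplied.)
const ostream = createWriteStream("/ignored", {
    fd: stdout.fd,
    autoClose: false,
});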

Upvotes: -1

niry

Reputation: 3308

Use process.stdout.cork()

The writable.cork() method forces all written data to be buffered in memory. The buffered data will be flushed when either the stream.uncork() or stream.end() methods are called.

Note: this also corks console.log(), which writes through process.stdout.

process.stdout.cork();
for (let i = 1; i <= 1000; i++) {
    process.stdout.write(`line ${i}\n`);
}
process.stdout.uncork();
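For example, a console.log() issued while the stream is corked is buffered along with everything else:

process.stdout.cork();
console.log('buffered');   // goes through process.stdout.write
process.stdout.uncork();   // everything is flushed here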

Or, if you want the buffer to be flushed automatically on every turn of the event loop, you can override write():

const { stdout } = process;
const { write } = stdout;

stdout.write = function () {
  // On the first write of each tick, cork the stream and schedule a
  // flush (uncork) for the next tick.
  if (this.writableCorked === 0) {
    this.cork();
    process.nextTick(() => this.uncork());
  }
  // Preserve write()'s boolean backpressure return value.
  return write.apply(this, arguments);
};

for (let i = 1; i <= 1000; i++) {
  process.stdout.write(`line ${i}\n`);
}

If you are worried about the buffer getting too big, you might also want to check stdout.writableNeedDrain.
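For instance, building on the override above, a sketch of one way to fold that check in:

stdout.write = function () {
  if (this.writableCorked === 0) {
    this.cork();
    process.nextTick(() => this.uncork());
  }
  const ok = write.apply(this, arguments);
  // If the corked buffer has crossed the high-water mark, flush now
  // rather than waiting for nextTick. (uncork() on an uncorked stream
  // is a no-op, so the queued nextTick callback stays harmless.)
  if (this.writableNeedDrain) this.uncork();
  return ok;
};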

strace -f -c -e trace=write,writev results:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00    0.000006           1         4           write
  0.00    0.000000           0         1           writev
------ ----------- ----------- --------- --------- ----------------
100.00    0.000006                     5           total

Tested using node v16.13.2

Upvotes: 2

kevintechie

Reputation: 1521

You can call .end() on stdout too. I'm not sure if this makes it safer.

import { promisify } from 'util';
import stream from 'stream';
import { stdout } from 'process';
import { createWriteStream } from 'fs';

const finished = promisify(stream.finished);

let ostream = createWriteStream("/ignored", { fd: stdout.fd });
for (let i = 1; i <= 1000; i++) {
    ostream.write(`line ${i}\n`);
}
ostream.end();

await finished(ostream);
console.log('Not quite done.');
stdout.end();
console.log('Not gonna see this.');

EDIT: I just read in the Node docs that on POSIX systems, writes to stdout are asynchronous when it is attached to a pipe. So the following code should work:

import { stdout } from 'process';
import { Readable } from 'stream';

// A do-nothing _read(): data is pushed manually below.
const data = new Readable();
data._read = () => {};

data.pipe(stdout);

for (let i = 1; i <= 1000; i++) {
  data.push(`line ${i}\n`);
}
data.push(null); // signal end-of-stream so the pipe can finish

Upvotes: 0
