Oscar

Reputation: 211

Reading stdout of child process unbuffered

I'm trying to read the output of a Python script launched by Node.js as it arrives. However, I only get access to the data once the process has finished.

var spawn = require('child_process').spawn;

var proc, args;

args = [
    './bin/build_map.py',
    '--min_lon',
    opts.sw.lng,
    '--max_lon',
    opts.ne.lng,
    '--min_lat',
    opts.sw.lat,
    '--max_lat',
    opts.ne.lat,
    '--city',
    opts.city
];

proc = spawn('python', args);

proc.stdout.on('data', function (buf) {
    console.log(buf.toString());
    socket.emit('map-creation-response', buf.toString());
});

If I launch the process with { stdio: 'inherit' } I can see the output as it happens, directly in the console. But listening with proc.stdout.on('data', ...) as shown above does not work.

How do I make sure I can read the output from the child process as it arrives and direct it somewhere else?

Upvotes: 8

Views: 3048

Answers (2)

Nadav Har'El

Reputation: 13731

It is Python that is doing the buffering: it sees that its standard output has been redirected to a pipe rather than a terminal, so it buffers the output instead of writing it immediately. You can easily tell Python not to do this: just run "python -u" instead of "python". It should be as simple as that.
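For example, a minimal sketch against the code in the question (the args array and socket are assumed to be the same as in the question; the only change is the -u flag):

var spawn = require('child_process').spawn;

// '-u' tells the Python interpreter not to buffer stdout/stderr,
// so each write from the script reaches this pipe immediately.
var proc = spawn('python', ['-u'].concat(args));

proc.stdout.on('data', function (buf) {
    console.log(buf.toString());
    socket.emit('map-creation-response', buf.toString());
});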

Upvotes: 3

ctt

Reputation: 1435

When a process is spawned by child_process.spawn(), the streams connected to the child process's standard output and standard error are actually unbuffered on the Node.js side. To illustrate this, consider the following program:

const spawn = require('child_process').spawn;

// Have bash print one '.' per second for 80 seconds.
var proc = spawn('bash', [
  '-c',
  'for i in $(seq 1 80); do echo -n .; sleep 1; done'
]);

proc.stdout
  .on('data', function (b) {
    // Each chunk is forwarded as soon as the child writes it.
    process.stdout.write(b);
  })
  .on('close', function () {
    process.stdout.write("\n");
  });

This program runs bash and has it emit a . character every second for 80 seconds, while the parent consumes the child's standard output via data events. You should see the Node.js program print a dot every second, confirming that no buffering occurs on the Node.js side.

Also, as explained in the Node.js documentation on child_process:

By default, pipes for stdin, stdout and stderr are established between the parent Node.js process and the spawned child. It is possible to stream data through these pipes in a non-blocking way. Note, however, that some programs use line-buffered I/O internally. While that does not affect Node.js, it can mean that data sent to the child process may not be immediately consumed.

You may want to confirm that your Python program does not buffer its output. If you are emitting data from your Python program as separate, distinct writes to standard output, consider calling sys.stdout.flush() after each write so that Python actually writes the data instead of buffering it.
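For instance, a minimal sketch of what that might look like inside the Python script (the loop and the progress messages here are made up purely for illustration):

import sys
import time

for step in range(5):
    # Write a progress message, then flush so it reaches the pipe
    # immediately instead of sitting in Python's output buffer.
    sys.stdout.write("step %d done\n" % step)
    sys.stdout.flush()
    time.sleep(1)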

Update: In this commit, that passage was removed from the Node.js documentation for the following reason:

doc: remove confusing note about child process stdio

It’s not obvious what the paragraph is supposed to say. In particular, whether and what kind of buffering mechanism a process uses for its stdio streams does not affect that, in general, no guarantees can be made about when it consumes data that was sent to it.

This suggests that buffering can happen before the Node.js process ever receives the data. Even so, care should be taken to ensure that the processes under your control upstream of Node.js are not buffering their output.

Upvotes: 0
