igonejack
igonejack

Reputation: 2532

What would happens on buffer level when an input stream piping to multi output streams?

I'm reading stream document and looking for buffering behavior description about streams at https://nodejs.org/api/stream.html#stream_buffering

The document seems not mentioning about what would happen to inputStream buffer(or buffers?), when piping to multiple outputs as different output have different consuming speeds:

Does the the readableStream keep a dedicated buffer for every output when piping multiple outputs?

Does the outputs keep same speed when consuming or the faster would end earlier?

const input = fs.createReadStream('img.jpg');
const target1 = input.pipe(fs.createWriteStream('target1.jpg'));
const target2 = input.pipe(fs.createWriteStream('target2.jpg'));

Upvotes: 3

Views: 289

Answers (1)

Michał Karpacki
Michał Karpacki

Reputation: 2658

TL;DR: The short answer is - the slower target stream controls the flow rate.

So first of all let's see what happens on read side.

const input = fs.createReadStream('img.jpg');

When you instantiate the input stream it is created in paused mode and scheduled for reading (no reading is done synchronously, so it will not access the file yet). The stream has highWaterMark set to something like 16384 and currently has a buffer of 0 bytes.

const target1 = input.pipe(fs.createWriteStream('target1.jpg'));
const target2 = input.pipe(fs.createWriteStream('target2.jpg'));

Now when you actually pipe it to the writable streams the flowing mode is set by adding the on('data') event handler in the pipe method implementation - see the source.

When this is done I assume there's no more program to run, so node starts the actual reading and runs the planned code in the handler above which simply writes any data that comes through.

The flow control happens when any of the targets has more data to write than its highWaterMark which cayses the write operation to return false. The reading is then stopped by calling pause here in the code. Two lines above this you'll see that state.awaitDrain is incremented.

Now the read stream is paused again and the writable streams are writing bytes to disk - at some point the buffer level again goes below the highWaterMark. At this point a drain event is fired which executes this line and, after all awaited drains have been called, resumes the flow. This is done by checking if the decremented awaitDrain property has reached zero, which means that all the awaited drain events have been called.

In the above case the faster of the two streams could return a falsy value while writing, but it would definitely drain as the first. If it wasn't for the awaitDrain the faster stream would resume the data flow and that would cause a possible buffer overflow in the slower of the two.

Upvotes: 4

Related Questions