Is my understanding of libuv threadpool in node.js correct?

I wrote the following node.js program (node version 6.2.0 on Ubuntu 14.04) to understand more about the libuv threadpool in node.js. In the program, I am reading two text files of size 10KB each. After the files are read successfully, I am doing a compute-intensive task (in the callback).

var log4js = require('log4js'); // For logging output with timestamps
var logger = log4js.getLogger();

var fs = require('fs');

fs.readFile('testFile0.txt', function (err, data) { // read testFile0.txt
    logger.debug('data read of testFile0.txt');

    for (var i = 0; i < 10000; i++) { // Computing intensive task. Looping 10^10 times
        for (var j = 0; j < 10000; j++) {
            for (var k = 0; k < 100; k++) {
            }
        }
    }
});

fs.readFile('testFile1.txt', function (err, data) { // read testFile1.txt
    logger.debug('data read of testFile1.txt');

    for (var i = 0; i < 10000; i++) { // Computing intensive task. Looping 10^10 times
        for (var j = 0; j < 10000; j++) {
            for (var k = 0; k < 100; k++) {
            }
        }
    }
});

As per my understanding of the libuv threadpool, the two files should be read immediately and the time difference between the printing of the statements "data read of testFile0.txt" and "data read of testFile1.txt" should be very small (milliseconds, or a second at most), since the default thread pool size is 4 and there are only two async requests (the file read operations). But the time difference between the printing of "data read of testFile0.txt" and "data read of testFile1.txt" is quite large (10 seconds). Can someone explain why the time difference is so large? Does the compute-intensive task being done in the callback contribute to the large time difference?

Upvotes: 1

Views: 250

Answers (1)

saghul

Reputation: 2010

libuv has a threadpool of size 4 (by default), so that part is correct. Now, let's see how it is actually used.

When some operation is queued in the threadpool, it's run on one of the threads, and then the result is posted back to the "main" thread, that is, the thread where the event loop is running. Results are processed in FIFO order.

In your case, reading the files happens in parallel, but processing the results will be serialized. This means that while bytes are read from the disk in parallel, the callbacks will always run one after another.

You see the delay because the second callback can only run after the first one has finished, and that first callback takes ~10 seconds.
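Here is a stripped-down sketch of the same situation, with timestamps around the busy work (the 1e9 iteration count is arbitrary; pick whatever takes a few seconds on your machine):

var fs = require('fs');

function spin() { // busy-loop for a few seconds on the main thread
    for (var i = 0; i < 1e9; i++) {}
}

['testFile0.txt', 'testFile1.txt'].forEach(function (name) {
    fs.readFile(name, function (err, data) {
        console.log(name, 'callback start', new Date().toISOString());
        spin(); // blocks the event loop, so the other callback has to wait
        console.log(name, 'callback end', new Date().toISOString());
    });
});

The "start" line for the second file only shows up after the "end" line of the first, even though both reads completed in the threadpool long before that.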

One way to make this truly parallel would be to do the computation in the threadpool itself, though you'd need an addon that uses uv_queue_work for that, or you could use the child_process module.
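As a rough sketch of the child_process route (the compute.js worker script and the message format here are made up for illustration), you could fork a worker per file so the heavy loop runs outside the main process:

// main.js -- sketch: offload the heavy loop to child processes
var fs = require('fs');
var fork = require('child_process').fork;

['testFile0.txt', 'testFile1.txt'].forEach(function (name) {
    fs.readFile(name, function (err, data) {
        if (err) throw err;
        console.log('data read of ' + name);

        var worker = fork('./compute.js'); // hypothetical worker script, see below
        worker.send({ file: name });
        worker.on('message', function (msg) {
            console.log('computation for ' + name + ' finished');
            worker.kill();
        });
    });
});

// compute.js -- hypothetical worker: runs the CPU-bound loops in its own process
process.on('message', function (msg) {
    for (var i = 0; i < 10000; i++)
        for (var j = 0; j < 10000; j++)
            for (var k = 0; k < 100; k++) {}

    process.send({ file: msg.file, done: true });
});

With this setup both readFile callbacks return almost immediately, and the two computations run in parallel in separate processes.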

Upvotes: 2
