Ryan
Ryan

Reputation: 778

Node.js Redis Async Issues

I'm coming from a PHP background, trying to wrap my head around the "event driven" Node.js environment. I've written a small script that reads files from a directory and updates Redis with the title of the show, season, and episode numbers (if they are later than what is currently stored in the database). It appears I'm running into an asyncronous issue I can't quite wrap my head around where the Redis DB has a title of "My Show", season "05", and title "01". There are two files being read in that contain "My Show S05E02" and "My Show S05E01".

The database should only update if the season/episode is later than the current season/episode, however since "updateTitle" is called very quickly, and for some reason "My Show S05E02" is passed before "My Show S05E01" the update function always compares both values to the original value "My Show S05E01", thus it updates Redis with E02, then E01 again!

Here's the code:

function processFiles()
{
fs.readdir(WATCH_DIR, function(err, files){
    for (var i = 0; i <= files.length; i++)
    {
        checkFile(files[i]);
    }
});
}

function updateTitle(title, season, episode)
{
var cur_season, cur_episode;

redis_client.hget(title, 'Season', function(err, data){
    cur_season = data;
    redis_client.hget(title, 'Episode', function(err, data){
        cur_episode = data;
        redis_client.sismember('Titles', title, function(err, data){
            console.log('comparing S'+season+'E'+episode+' to current S'+cur_season+'E'+cur_episode);
            if ((season == cur_season && episode >= cur_episode) || season > cur_season)
            {
                redis_client.hset(title, 'Season', season);
                redis_client.hset(title, 'Episode', episode);
                console.log('setting '+title+' S'+season+'E'+episode);
            }
        });
    });
});
}

function checkFile(file, mtime)
{
var reg         = new RegExp("^"+FILE_PREFIX);
var seasoned    = new RegExp("S(\\d{2})E(\\d{2})", "i");

var cache = {}
if (reg.test(file))
{
    fs.stat(WATCH_DIR + file, function(err, stats){
        console.log(file, stats.mtime.toLocaleDateString() +' '+ stats.mtime.toLocaleTimeString() );
        fs.readFile(WATCH_DIR + file, 'utf8', function(ferr, data){
            if (seasoned.test(data))
            {
                title = data.replace(/S(\d{2})E(\d{2})(.*?)$/, '')
                    .replace(/[\._\-]+/, ' ')
                    .replace(/^\s+/, '')
                    .replace(/\s+$/, '');

                var season = data.match(/S(\d{2})/i);
                season = season[1];
                var episode = data.match(/E(\d{2})/i);
                episode = episode[1];
                updateTitle(title, season, episode);
            }
        });
    });
}
}

fs.watch(WATCH_DIR, function(type, file){
if (type == 'change')
{
    processFiles();
}
});

Any help would be appreciated. I'm sure there are other errors or best practices here, feel free to share those as well - but I'm just knocking my head against the wall trying to figure out the async issue!

FYI - This is just a pet project so that I can remember which episode I'm currently on for each of the shows I enjoy watching.

Upvotes: 0

Views: 856

Answers (3)

Nick Mitchinson
Nick Mitchinson

Reputation: 5480

A solution which has not been mentioned would be to utilize the async library. With this module you can easily serialize all the 'checkFile' calls. You will effectively add all the calls to a queue, with each call waiting for the previous one to complete before executing. While this solution may run a bit slower, you shouldn't end up with any of the control flow issues you seem to be experiencing.

Upvotes: 1

Bret Copeland
Bret Copeland

Reputation: 23990

The problem is that you don't have a guaranteed order of execution here. For example, it might happen in this order:

  1. Read File 1
  2. Make first get request to Redis
  3. Read File 2
  4. Make second get request to Redis
  5. First get from Redis returns.
  6. Second get from Redis returns.
  7. Perform first set to Redis.
  8. Perform second set to Redis.

If the first operation depends on the outcome of the second operation, you need to make sure the first one completes before starting the second, and that is generally accomplished using callbacks.

Consider this:

function doSomethingAsync (num) {
    console.log('Starting something ' + num);
    setTimeout(function () {
        console.log('Done doing something ' + num);
    }, 10);
}

function runEverything () {
    for (var i = 0; i < 3; i++)
        doSomethingAsync(i);
}

runEverything();

which outputs:

Starting something 0
Starting something 1
Starting something 2
Done doing something 0
Done doing something 1
Done doing something 2

However, if we add a callback structure and replace the loop in runEverything to make use of these callbacks, then it will wait for the previous doSomethingAsync to finish before it starts executing the next one:

function doSomethingAsync (num, callback) {
    console.log('Starting something ' + num);
    setTimeout(function () {
        console.log('Done doing something ' + num);
        callback();
    }, 10);
}

function runEverything () {
    var i = 0;
    var doneCallback = function () {
        if (++i < 3)
            doSomethingAsync(i, doneCallback);
    };

    doSomethingAsync(i, doneCallback);
}

which outputs:

Starting something 0
Done doing something 0
Starting something 1
Done doing something 1
Starting something 2
Done doing something 2

Welcome to Node.js.

Upvotes: 2

Michael Pratt
Michael Pratt

Reputation: 3496

This may not be the best solution to your problem, but it probably is the simplest.

You're correct - the reason it's getting set back to E01 is due to a race condition. It's attempting to set the episode to 02 then moving on to the next file, which is 01, but it hasn't caught up and finished setting it to 02 yet, so it reads 01. Here's the problematic line:

if ((season == cur_season && episode >= cur_episode) || season > cur_season)

As you can see, you're checking if episode >= cur_episode. Change that to episode > cur_episode and you should at least solve the episodes going backwards in this specific case. You might still run into problems if the current episode is 01, it reads in 03, then 02, and 02 overwrites 03 before 03 is written. You might try reading in all the changed episodes, then inspect them and decide which is last in-application rather than repeatedly writing and reading from redis.

Really you'd want to solve the race condition, which would solve the general case and make you a much stronger Node programmer, but then you'd get into a myriad of different ways to solve this (deferreds, callbacks, promises, etc...) which is far outside the bounds of this discussion. I suggest you read up, continue learning and good luck!

Upvotes: 1

Related Questions