archLucifer
archLucifer

Reputation: 389

How do I select certain fields from a MongoDB and return an empty object or string if it doesn't exist?

I am using Nodejs and the native MongoDB driver (1.4.35), I am streaming a collection to the client, to optimize the stream and server CPU usage I want to select a certain field from the collection and if the field doesn't exist I want to it to return a blank object instead of returning nothing.

What I am currently doing:

    var cursor = req.db.collection("customer")
    .find(req.dbQuery, {
        'recipientDetails.firstname': true,
        'recipientDetails.surname': true,
        'recipientDetails.emailAddress': true,
        'recipientDetails.mobileNumber': true
    })
    .batchSize(parameters.mongoBatchSize);
    req.stream = cursor.stream();
    req.stream.on('data', function (doc) {
        res.write(toCsv(doc)); //doc is empty if recipientDetails doesn't exist
    });
    req.stream.on('close', function () {
        res.end();
    });

How I currently check if recipientDetails exists:

function toCsv(doc) {
    if (doc.recipientDetails) {
        return checkUndefined(doc.recipientDetails.firstname) + ',' +
            checkUndefined(doc.recipientDetails.surname) + ',' +
            checkUndefined(doc.recipientDetails.emailAddress) + ',' +
            checkUndefined(doc.recipientDetails.mobileNumber) + ',';
    }
    return ",,,,,,";
}
function checkUndefined(value) {
   return value ? value : "";
}

The above returns an empty string if the field doesn't exist, and works. However, the problem comes in when the amount of documents increase, the CPU takes long to process it and causes other requests to hang.

What I want I do so I don't need the if statements and checks to increase the performace:

 var cursor = req.db.collection("customer")
.find(req.dbQuery, {
    'recipientDetails.firstname': true || null, //don't know the correct syntax, but I want the result to be recipientDetails.firstname="" if it doesn't exist
    'recipientDetails.surname': true,
    'recipientDetails.emailAddress': true,
    'recipientDetails.mobileNumber': true
})
.batchSize(parameters.mongoBatchSize);

Upvotes: 0

Views: 203

Answers (2)

archLucifer
archLucifer

Reputation: 389

I found a way to optimize the server by changing the find to an aggregate, thus, letting mongo do all the work. It is also important to note that this approach is much more resource intensive on mongo - which is a good trade off as mongo is already optimized.

What I changed:

var cursor = req.db.collection("customer")
.aggregate([
        {$match: req.dbQuery},
        {
            $project: {
                'recipientDetails.firstname': {"$ifNull": ["$recipientDetails.firstname", ""]},
                'recipientDetails.surname': {"$ifNull": ["$recipientDetails.surname", ""]},
                'recipientDetails.emailAddress': {"$ifNull": ["$recipientDetails.emailAddress", ""]},
                'recipientDetails.mobileNumber': {"$ifNull": ["$recipientDetails.mobileNumber", ""]}
            }
        }
    ], {"cursor": {"batchSize": parameters.mongoBatchSize}});
req.stream = cursor; // No longer cursor.stream() as the aggregate is a stream already.

When the {"$ifNull": ["$recipientDetails.firstname", ""]} is used, mongo checks to see if the field is null, if it is not null, it returns the first field in the array above and if it is null it returns the second field.

Upvotes: 1

Molda
Molda

Reputation: 5704

While I'm not sure if it's possible that mongo returns empty string for a field that doesn't exist( I doubt it it can ) you can certainly simplify the code you currently have to something like

req.stream.on('data', function (doc) {

    var rd = doc.recipientDetails;

    if(!rd) return res.write(',,,,,,');

    res.write(rd.firstname || "" + "," +  rd.surname || "" + "," + rd.emailAddress || "" + "," + rd.mobileNumber || "");

});

Upvotes: 0

Related Questions