Srinivas Nahak

Reputation: 1876

How to find null documents in mongodb?

I'm a complete beginner in MongoDB. I'm trying to find all the documents that contain null or nothing, for example documents like { "_id" : "abc" }, so that I can delete them from the collection.

Even after searching through a lot of SO questions I couldn't find a solution. So how can I do this? Sorry if I'm overlooking something.

Upvotes: 1

Views: 1951

Answers (2)

chridam

Reputation: 103365

One possible solution is to get a list of the _id values of the documents that have no fields other than _id, and then remove them. This can be considerably more efficient because you execute only two queries instead of looping through the whole collection on the client, which can hurt database performance, especially with large collections.

Consider running the following aggregate pipeline to get those ids:

var ids = db.collection.aggregate([
    { "$project": {
        "hashmaps": { "$objectToArray": "$$ROOT" } 
    } }, 
    { "$project": {
        "keys": "$hashmaps.k"
    } },
    // Keep a document only when it has no second key, i.e. _id is its only field
    { "$redact": {
        "$cond": [
            {
                "$eq": [
                    {
                        "$ifNull": [
                            { "$arrayElemAt": ["$keys", 1] },
                            0
                        ]
                    },
                    0
                ]
            },
            "$$KEEP",
            "$$PRUNE"
        ]
    } },
    { "$group": {
        "_id": null,
        "ids": { "$push": "$_id" }
    } }
]).toArray()[0]["ids"];

Then remove the documents:

db.collection.remove({ "_id": { "$in": ids } });
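For comparison, the same filter can probably be written as a single $match stage that counts the top-level fields. This is only a sketch, assuming MongoDB 3.6 or later (where $expr is available); the helper name buildIdOnlyPipeline is hypothetical and not part of any MongoDB API.

```javascript
// Hypothetical helper: builds a pipeline that keeps only the documents
// whose sole top-level field is _id, and returns just their _id values.
function buildIdOnlyPipeline() {
  return [
    // $objectToArray turns the document into [{ k, v }, ...]; a size of 1 means only _id exists
    { $match: { $expr: { $eq: [{ $size: { $objectToArray: "$$ROOT" } }, 1] } } },
    { $project: { _id: 1 } }
  ];
}

var idOnlyPipeline = buildIdOnlyPipeline();

// In the mongo shell (not executed here):
// var ids = db.collection.aggregate(idOnlyPipeline).toArray()
//     .map(function (doc) { return doc._id; });
// db.collection.remove({ "_id": { "$in": ids } });
```

Mapping over toArray() also avoids indexing into an empty result when no document matches.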

The other approach is similar in that it also needs two queries: the first returns a list of all the top-level fields in the collection, and the second removes the documents that have none of those fields.

Consider running the following queries:

/* 
   Run an aggregate pipeline operation to get a list 
   of all the top-level fields in the collection
*/
var fields = db.collection.aggregate([
    { "$project": {
       "hashmaps": { "$objectToArray": "$$ROOT" } 
    } }, 
    { "$project": {
       "keys": "$hashmaps.k"
    } },
    { "$group": {
        "_id": null,
        "fields": { "$addToSet": "$keys" }
    } },
    { "$project": {
            "fields": {
                "$setDifference": [
                    {
                        "$reduce": {
                            "input": "$fields",
                            "initialValue": [],
                            "in": { "$setUnion" : ["$$value", "$$this"] }
                        }
                    },
                    ["_id"]
                ]
            }
        }
    }
]).toArray()[0]["fields"];
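To make the result of that pipeline concrete, here is an in-memory sketch of what it computes: the union of all top-level keys, minus _id. The function name collectTopLevelFields and the sample documents are made up for illustration; the real work is done server-side by the aggregation above.

```javascript
// Hypothetical in-memory equivalent of the pipeline: union of all
// top-level keys across the documents, excluding _id.
function collectTopLevelFields(docs) {
  var seen = new Set();
  docs.forEach(function (doc) {
    Object.keys(doc).forEach(function (key) {
      if (key !== "_id") {
        seen.add(key);
      }
    });
  });
  return Array.from(seen);
}

// Made-up sample documents
var sampleDocs = [
  { _id: "abc" },
  { _id: "def", a: 1, b: 2 },
  { _id: "ghi", c: null }
];

var sampleFields = collectTopLevelFields(sampleDocs);
// sampleFields holds "a", "b" and "c"; the server-side set operators
// do not guarantee any particular order
```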

The second query checks that every field except _id is absent. For example, if your collection has documents with the keys _id, a, b and c, the query

db.collection.find({ 
    "a" : { "$exists": false },
    "b" : { "$exists": false },
    "c" : { "$exists": false }
}); 

matches documents that contain none of the three fields a, b and c.

So if you have a list of the top-level fields in your collection, all you need to do is construct the above query document. Use the reduce method on the array for this:

// Construct the above query
var query = fields.reduce(function(acc, curr) {
    acc[curr] = { "$exists": false };
    return acc;
}, {});

Then use the query to remove the documents:

db.collection.remove(query);
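Since remove is destructive, it may be worth inspecting the constructed query before running it. The sketch below wraps the reduce step in a hypothetical helper, buildMissingFieldsQuery, and applies it to a made-up field list. Note that an empty field list produces {}, which matches every document in the collection.

```javascript
// Hypothetical helper: builds { field: { $exists: false }, ... } for each field name.
function buildMissingFieldsQuery(fields) {
  return fields.reduce(function (acc, curr) {
    acc[curr] = { "$exists": false };
    return acc;
  }, {});
}

// Made-up field list standing in for the aggregation result
var exampleQuery = buildMissingFieldsQuery(["a", "b", "c"]);
// exampleQuery is { a: { $exists: false }, b: { $exists: false }, c: { $exists: false } }

// In the mongo shell (not executed here), a dry run before deleting:
// db.collection.find(exampleQuery).count();
```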

Upvotes: 0

ema

Reputation: 5773

I don't know how to do it in a single operation, but you can try something like this:

db["collectionName"].find({_id: {$exists: true}}).forEach(function(doc) {
  if (Object.keys(doc).length === 1) {
    // _id is the only field, so delete this document
    db["collectionName"].remove({_id: doc._id});
  }
});
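A possible variation on this loop is to collect the matching _id values first and issue one remove at the end, rather than one remove per document. The sketch below shows the filtering step on made-up in-memory documents; hasOnlyId is a hypothetical helper, and the shell calls are left as comments because they need a live database.

```javascript
// Hypothetical helper: true when _id is the document's only top-level field.
function hasOnlyId(doc) {
  var keys = Object.keys(doc);
  return keys.length === 1 && keys[0] === "_id";
}

// Made-up sample documents standing in for the cursor's results
var docs = [
  { _id: "abc" },
  { _id: "def", name: "x" },
  { _id: "ghi" }
];

var idsToDelete = docs.filter(hasOnlyId).map(function (doc) {
  return doc._id;
});
// idsToDelete is ["abc", "ghi"]

// In the mongo shell (not executed here):
// var idsToDelete = db["collectionName"].find().toArray()
//     .filter(hasOnlyId).map(function (doc) { return doc._id; });
// db["collectionName"].remove({ _id: { $in: idsToDelete } });
```

This still reads every document on the client, so it suits small collections better than large ones.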

Upvotes: 3
