ktamas
ktamas

Reputation: 443

MongoDB: index to speed up querying not null fields

Let's suppose I have a webPages collection in mongoDB with following document structure inside:

{
    _id: ObjectId,
    url: String,
    title: String,
    links: [ComplexObject]
}

where ComplexObject is a complex object, with fields and lists inside (details are not important).

Question: what type of index & query combination should I use to be able to query those documents where links field is not null?

Note that collection has millions of documents, so the query has to use an index, otherwise it will be slow.

Also note that if not necessary, I don't want to have an index on the field links (besides the index which speeds up the non-null query) as it can be expensive/big and I don't need it.

Upvotes: 0

Views: 484

Answers (1)

Shivam Mishra
Shivam Mishra

Reputation: 1856

As per your query link:[ComplexObject] contains a complex object and have one to many relationships.

if it is complex you should go for a new collection referencing a root collection, because increase in size of single object of collection will slow down you query eventually. and of course it will be difficult to achieve what is desired.

Eg:

webPages collection

{
    _id: ObjectId,
    url: String,
    title: String,
}

links collection

{
 _id: ObjectId,
 webpage_id: ObjectId // indexed
 ....
}

Your Question:

what type of index & query combination should I use to be able to query those documents where links field is not null ?

Answer: in new approach i don't think you have to check for null object.

but for older approach you can go for $exists keyword in simple queries OR $ifNull pipeline with $size pipeline in aggregation. index like compound index will help a lot in your structure.

Note: Do not store null value in property either $set it or $unset it, then $exists will work.

Eg.

compound index

db.collection.createIndex( { "links.property1": 1, "links.property2": 1 } );

$exists

webPages.find({ $exists: { link: true }});

https://docs.mongodb.com/manual/reference/operator/query/exists/

aggregation

to check null values aggregation has a pipeline call $ifNull

webPages.aggregate([
{
        $addFields: {
            isExists: {
                $ifNull: [ "$links", 0]
            }
        }
    },
    {
        $match: {
            // validate isExists
        }
    }
]);

to check if array is empty

webPages.aggregate([[ { $addFields: { arrayLength: { $size: "$links" } } }, { $match: { // validate arrayLength } } ]);

there are alot way to handle such stuff.

Upvotes: 0

Related Questions