YulePale
YulePale

Reputation: 7706

MongoDB: Querying multiple fields and indexing

I am using MongoDB via mongoose.

  1. What happens when you query an indexed field with a non-indexed field? See code below.

    // For example
    MyModel.find({ _id: '123', name: 'Jina'});
    

    Does MongoDB do a collection scan or does the index help in making the query efficient? And is the query any different from just using the _id field?

  2. If MongoDB does a collection scan when you query a non-indexed field. Does querying multiple non-indexed fields speed up the query speed if a collection scan will be done all the same? Lets say if I query five non-indexed fields as opposed to two(both returning the same document(s)). Did both queries do the same collection scan?

Upvotes: 1

Views: 356

Answers (1)

Ana Lava
Ana Lava

Reputation: 787

To understand what is happening under the hood in mongo queries, you can use explain. For example, consider the following query: db.getCollection('users').find({"name":"ana"}) which queries a non-indexed field. You can use explain on this query as below:

db.getCollection('users').find({"name":"ana"}).explain("executionStats")

Part of the result is:

"queryPlanner" : {
        "plannerVersion" : 1,
        "namespace" : "anonymous-chat.users",
        "indexFilterSet" : false,
        "parsedQuery" : {
            "name" : {
                "$eq" : "ana"
            }
        },
        "winningPlan" : {
            "stage" : "COLLSCAN",
            "filter" : {
                "name" : {
                    "$eq" : "ana"
                }
            },
            "direction" : "forward"
        },
        "rejectedPlans" : []
    },

As you can see, here we had a COLLSCAN which is a collection scan. Now we just query _id and see the result:

db.getCollection('users').find({"_id":ObjectId("5ee9b6c125b9a9a426d9965f")}).explain("executionStats")

Here is the result:

"queryPlanner" : {
        "plannerVersion" : 1,
        "namespace" : "anonymous-chat.users",
        "indexFilterSet" : false,
        "parsedQuery" : {
            "_id" : {
                "$eq" : ObjectId("5ee9b6c125b9a9a426d9965f")
            }
        },
        "winningPlan" : {
            "stage" : "IDHACK"
        },
        "rejectedPlans" : []
    },

As we can see, we have IDHACK when querying just _id.

Now we combine _id and name:

db.getCollection('users').find({"_id":ObjectId("5ee9b6c125b9a9a426d9965f"), "name":"ana"}).explain("executionStats")

This is the result:

 "queryPlanner" : {
        "plannerVersion" : 1,
        "namespace" : "anonymous-chat.users",
        "indexFilterSet" : false,
        "parsedQuery" : {
            "$and" : [ 
                {
                    "_id" : {
                        "$eq" : ObjectId("5ee9b6c125b9a9a426d9965f")
                    }
                }, 
                {
                    "name" : {
                        "$eq" : "ana"
                    }
                }
            ]
        },
        "winningPlan" : {
            "stage" : "FETCH",
            "filter" : {
                "name" : {
                    "$eq" : "ana"
                }
            },
            "inputStage" : {
                "stage" : "IXSCAN",
                "keyPattern" : {
                    "_id" : 1
                },
                "indexName" : "_id_",
                "isMultiKey" : false,
                "multiKeyPaths" : {
                    "_id" : []
                },
                "isUnique" : true,
                "isSparse" : false,
                "isPartial" : false,
                "indexVersion" : 2,
                "direction" : "forward",
                "indexBounds" : {
                    "_id" : [ 
                        "[ObjectId('5ee9b6c125b9a9a426d9965f'), ObjectId('5ee9b6c125b9a9a426d9965f')]"
                    ]
                }
            }
        },
        "rejectedPlans" : []
    },

As we can see, The index helped with the query performance as we have two stages, an IXSCAN (Index Scan) and a FETCH stage which filters the documents of the last stage.

Now let's query multiple non-indexed fields to find out about your second question:

db.getCollection('users').find({"name":"ana", "appId":1}).explain("executionStats")

"queryPlanner" : {
        "plannerVersion" : 1,
        "namespace" : "anonymous-chat.users",
        "indexFilterSet" : false,
        "parsedQuery" : {
            "$and" : [ 
                {
                    "appId" : {
                        "$eq" : 1.0
                    }
                }, 
                {
                    "name" : {
                        "$eq" : "ana"
                    }
                }
            ]
        },
        "winningPlan" : {
            "stage" : "COLLSCAN",
            "filter" : {
                "$and" : [ 
                    {
                        "appId" : {
                            "$eq" : 1.0
                        }
                    }, 
                    {
                        "name" : {
                            "$eq" : "ana"
                        }
                    }
                ]
            },
            "direction" : "forward"
        },
        "rejectedPlans" : []
    },

There is only one collection scan for multiple fields as we can see above.

Upvotes: 3

Related Questions