Parveen
Parveen

Reputation: 149

Index on the embedded documents as whole

I have the following structure on the document

{
  "_id": ObjectId("1234"),
  "name": "TestName",
  "location": { state: "NY", city: "New York" ,"geoid":28042},
  "ts":<timestamp>
..
..
}.. 

I have around 1 GB of data on with the above collection.

I have created an index on the location i.e db.mydata.createIndex( { location: 1 } ). I have three following questions.

  1. What will be the impact on the performance if I create an index on the whole document?
  2. Since the index is on "location", if I use "location.state"="NY", does this query use the above-defined index?
  3. Does index is used when my query is "location.city"="New York" and "location.state"="NY"? Does order matters here?

Upvotes: 1

Views: 206

Answers (1)

Kevin Smith
Kevin Smith

Reputation: 14436

If you create an index on the subdocument location it will have same impact as creating any other index as it will need to do a full collection scan and build up the index.

With an index on location, you won't be able to do a query of location.state as this does not match the whole sub-document. You'll only be use the index if you do the following query to match the sub document:

db.test.find({ "location": { state: "NY", city: "New York" ,"geoid":28042} }

This is because the index is on the field location not on location.state, if you needed to match on state and city I'd suggest that you add an index on both them fields.

db.test.createIndex({"location.state" : 1, "location.city": 1});

That way your query could use the index for state and for state andcity`.

if we put this together in a simple example we can all run, we can look at the execution plan.

if let's insert a document with a sub-document.

db.test.insertOne({
  subDocument: { field1 : 1, field2 : 2 }
});

Now we can create an index on subDocument:

db.test.createIndex({subDocument:1})

Then we run a explain on the query for field1 only.

db.test.find( { "subDocument.field1" : 1 } ).explain()
{
        "queryPlanner" : {
                "plannerVersion" : 1,
                "namespace" : "test.test",
                "indexFilterSet" : false,
                "parsedQuery" : {
                        "subDocument.field1" : {
                                "$eq" : 1
                        }
                },
                "queryHash" : "CEC76D6B",
                "planCacheKey" : "CEC76D6B",
                "winningPlan" : {
                        "stage" : "COLLSCAN",
                        "filter" : {
                                "subDocument.field1" : {
                                        "$eq" : 1
                                }
                        },
                        "direction" : "forward"
                },
                "rejectedPlans" : [ ]
        },
        "serverInfo" : { },
        "ok" : 1
}

as you can see the winning plan is COLLSCAN.

Now let's query based on the whole document instead.

db.test.find({"subDocument": {field1: 1, field2: 2 }}).explain()
{
        "queryPlanner" : {
                "plannerVersion" : 1,
                "namespace" : "test.test",
                "indexFilterSet" : false,
                "parsedQuery" : {
                        "subDocument" : {
                                "$eq" : {
                                        "field1" : 1,
                                        "field2" : 2
                                }
                        }
                },
                "queryHash" : "3B0F9692",
                "planCacheKey" : "7CF9BAC4",
                "winningPlan" : {
                        "stage" : "FETCH",
                        "inputStage" : {
                                "stage" : "IXSCAN",
                                "keyPattern" : {
                                        "subDocument" : 1
                                },
                                "indexName" : "subDocument_1",
                                "isMultiKey" : false,
                                "multiKeyPaths" : {
                                        "subDocument" : [ ]
                                },
                                "isUnique" : false,
                                "isSparse" : false,
                                "isPartial" : false,
                                "indexVersion" : 2,
                                "direction" : "forward",
                                "indexBounds" : {
                                        "subDocument" : [
                                                "[{ field1: 1.0, field2: 2.0 }, { field1: 1.0, field2: 2.0 }]"
                                        ]
                                }
                        }
                },
                "rejectedPlans" : [ ]
        },
        "serverInfo" : { },
        "ok" : 1
}

Now we can see it using the index scan IXSCAN.

Upvotes: 1

Related Questions