marmor
marmor

Reputation: 28179

mongodb distinct sometimes uses index and sometimes doesn't

I have collection A with a non-unique index over "field1".

If I run:

db.A.explain().distinct("field1")

I get:

"winningPlan" : {
    "stage" : "PROJECTION",
    ...
    "inputStage" : {
        "stage" : "DISTINCT_SCAN",
        "keyPattern" : {
            "field1" : 1.0
        },
    ...
}

Which suggests it'll use the index for the distinct call.

However, in collection B with a non-unique index on "type2.key", if I run:

db.B.explain().distinct("type2.key")

I get:

"winningPlan" : {
    "stage" : "COLLSCAN",
    "filter" : {
        "$and" : []
    },
    ...
}

which seems to mean it doesn't use the index.

Why can distinct use the index on collection A but not on collection B, and can I do something to force the use of the index?

Notes:

  1. collection B is a lot bigger then collection A, is there a limit to the size of the index distinct can use?
  2. I've read: Count distinct values in mongoDB and MongoDB - distinct with query doesn't use indexes they don't help to explain the difference in behavior I'm seeing.
  3. Both collections are sharded
  4. mongodb version is 3.2.12

EXAMPLE DOCUMENT

{
    "_id" : ObjectId("57d6c1cf691fa014e0615aa7"),
    "type1" : [ 
        {
            "key" : "key1",
            "field" : "value1",
        },
        {
            "key" : "key2",
            "field" : "value2",
        }
    ],
    "type2" : [ 
        {
            "key" : "key3",
            "field" : "value3",
        },
        {
            "key" : "key4",
            "field" : "value4",
        }
    ]
}

The index is on type2.key

Upvotes: 3

Views: 1639

Answers (1)

Alex Blex
Alex Blex

Reputation: 37048

The rules when distinct index can be used are there https://github.com/mongodb/mongo/blob/v3.4/src/mongo/db/query/get_executor.cpp#L1104

Most important line for this particular case https://github.com/mongodb/mongo/blob/v3.4/src/mongo/db/query/get_executor.cpp#L1139 says:

Skip multikey indices if we are projecting on a dotted field.

"obj.field2" is a dotted field, so index does not apply.

So basically, distinct() can use indexes only for root fields, not for array nor subdocuments.

Upvotes: 3

Related Questions