Reputation: 19764
I'm evaluating the performance of the following query.
db.products_old.find({ regularPrice: { $lte: 200 } })
The collection has a bit over a million documents, in total around 0.15GB.
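The plan excerpts below come from explain output. For reference, a sketch of how such output is produced in the mongo shell (collection name from the question; the verbosity level is an assumption, "allPlansExecution" also works and additionally reports rejected plans):

```javascript
// Run the query with execution statistics to see the winning plan
// and its timing. Requires a running MongoDB instance.
db.products_old.find({ regularPrice: { $lte: 200 } })
  .explain("executionStats")
```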
This is expected. Without an index, a full collection scan (COLLSCAN) has to be done:
"executionTimeMillis" : 1019,
"winningPlan" : {
"stage" : "COLLSCAN",
"filter" : {
"regularPrice" : {
"$lte" : 200
}
},
"direction" : "forward"
},
"executionTimeMillis" : 2842,
"winningPlan" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"regularPrice" : 1
},
"indexName" : "regularPrice_1",
"isMultiKey" : false,
"multiKeyPaths" : {
"regularPrice" : [ ]
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 2,
"direction" : "forward",
"indexBounds" : {
"regularPrice" : [
"[-inf.0, 200.0]"
]
}
}
},
Now it uses the index, but the execution time is noticeably worse. Why?
Also, if the performance is worse, why doesn't Mongo fall back to the COLLSCAN instead of using the index that slows the execution down? rejectedPlans is empty, which suggests that no other plan was even considered. Why?
Here's the full allPlansExecution output.
Upvotes: 4
Views: 542
Reputation: 468
While doing a COLLSCAN, MongoDB reads the documents sequentially from the storage drive and keeps the matching ones in RAM as it goes. An IXSCAN, on the other hand, reads the index, which stores the indexed values together with pointers to the documents' locations on the storage drive. (Here's a nice visualisation from slide 6 to around slide 20.)
You have a lot of documents in your collection, but a lot of them also match your query. The documents on disk are not laid out in index order, so when the IXSCAN returns pointers to the 220k+ documents it found for your query, the FETCH stage has to read from the storage drive 220k+ times in a random-access pattern, which is slow. The COLLSCAN, by contrast, reads sequentially, page by page, which is a lot faster than FETCH's random reads.
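To make that intuition concrete, here is a back-of-envelope cost model. The document and match counts come from the question and answer; every other constant is an illustrative assumption, not a measured value:

```javascript
// Rough relative-cost comparison of the two plans.
const docs = 1000000;       // ~1M documents (from the question)
const matches = 220000;     // ~220k matching documents (from the answer)

const docsPerPage = 100;    // assumed documents per sequentially read page
const seqPageCost = 1;      // relative cost of one sequential page read
const randomFetchCost = 10; // assumed penalty of one random document fetch

// COLLSCAN: read every page once, sequentially.
const collscanCost = (docs / docsPerPage) * seqPageCost;

// IXSCAN + FETCH: one random read per matching document
// (ignoring the comparatively cheap index traversal itself).
const fetchCost = matches * randomFetchCost;

console.log(collscanCost, fetchCost, fetchCost > collscanCost);
```

With these made-up constants the FETCH-heavy plan costs far more than the sequential scan, matching the observed timings; the exact crossover point depends on hardware and on how selective the predicate is.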
So to sum up: it's not the index that's slowing you down, it's the FETCH stage. If you still want to use this index and get a faster execution time, project only the indexed field, e.g. .select('-_id regularPrice') in Mongoose, which adds a quick PROJECTION stage and lets the query be answered entirely from the index (a covered query). If you need _id as well, add a compound index { regularPrice: 1, _id: 1 } instead.
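In plain mongo shell syntax (the .select call above is Mongoose-specific), the covered-query version would look roughly like this; it needs a running MongoDB instance:

```javascript
// Project only the indexed field and exclude _id, so the query can be
// answered entirely from the { regularPrice: 1 } index with no FETCH.
db.products_old.find(
  { regularPrice: { $lte: 200 } },
  { _id: 0, regularPrice: 1 }
)

// If _id is needed in the result, create a compound index instead:
db.products_old.createIndex({ regularPrice: 1, _id: 1 })
```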
And regarding why Mongo uses the index even though it could know that a collection scan is faster: if the planner sees a usable index for the predicate, it will generally prefer it. But you can force it to use a collection scan with the hint method, passing { $natural: 1 } to it.
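A sketch of forcing the collection scan in the shell (again, this runs against a live database):

```javascript
// A { $natural: 1 } hint forces a scan in natural (on-disk) order,
// bypassing the index entirely.
db.products_old.find({ regularPrice: { $lte: 200 } }).hint({ $natural: 1 })
```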
Upvotes: 4