Reputation: 143
My mongoDB collection (statistics) has 17 million documents.
I tried QUERY 1 :
db.statistics.find({
"updated_at": {$lt: ISODate("2018-08-21 01:00:00"), $gte: ISODate("2018-08-21 00:00:00")}
});
Returned : 534 documents. Time : 170 seconds.
After then, I tried QUERY 2 :
db.statistics.find({
"updated_at": {$lt: ISODate("2018-08-21 01:00:00")}
});
Returned : 16 million documents. Time : 0.065 second .
I tried explain on that.
Result QUERY 1 :
{
"stage" : "COLLSCAN",
"filter" : {
"$and" : [
{
"updated_at" : {
"$lt" : ISODate("2018-08-21T08:00:00.000+07:00")
}
},
{
"updated_at" : {
"$gte" : ISODate("2018-08-21T07:00:00.000+07:00")
}
}
]
},
"nReturned" : 534,
"executionTimeMillisEstimate" : 131587,
"works" : 17784357,
"advanced" : 534,
"needTime" : 17783822,
"needYield" : 0,
"saveState" : 139020,
"restoreState" : 139020,
"isEOF" : 1,
"invalidates" : 0,
"direction" : "forward",
"docsExamined" : 17784355
}
Result QUERY 2 :
{
"stage" : "COLLSCAN",
"filter" : {
"updated_at" : {
"$lt" : ISODate("2018-08-21T08:00:00.000+07:00")
}
},
"nReturned" : 16676319,
"executionTimeMillisEstimate" : 135988,
"works" : 17784357,
"advanced" : 16676319,
"needTime" : 1108037,
"needYield" : 0,
"saveState" : 139089,
"restoreState" : 139089,
"isEOF" : 1,
"invalidates" : 0,
"direction" : "forward",
"docsExamined" : 17784355
}
QUERY 2 very fast and QUERY 1 very slow.
How to find where is the cause of that?
Upvotes: 0
Views: 730
Reputation: 13805
If you do Query 1
& Query 2
in that order, a likely cause was that Query 1
took longer since it needs to load all the collection's data into RAM. Now when Query 2
runs, the whole collection is already in RAM.
Basically you have a cold cache for Query 1
, and you have a warm cache for Query 2
. If you reverse the order of execution, you should find that Query 1
will be fast.
This is because both explain output show two very similar plans:
COLLSCAN
stage, which means that MongoDB must load the whole collection into RAM, examine all 17,784,355 documents one by one, and return the requested documents.executionTimeMillisEstimate
What you want to do is to eliminate any COLLSCAN
in your explain output. Ideally you'd like to see IXSCAN
instead.
For example, if you create an index on the collection based on your query:
db.statistics.createIndex({updated_at: 1})
the index will significantly speed up the two queries you have.
More resources that you might find useful:
Upvotes: 2
Reputation: 1155
in this case you use ls and gt for query 1 That takes itself implicit mongoDb filtering using two parameters for whole data collection ( which has millions of records) thus giving it a more delay than just using lt. When using lt , it doesn't want to filter using both parameters just the values which are below you required parameter. that's why it behaves like that.
to mitigate this longer querying time, you can use index in mongoDb to specifically identify which fields should it query first. ( kind of like ObjectIds which has already indexed at the beginning of the document creation)
Upvotes: 0