Reputation: 143

How to find the cause of slow query mongo DB?

My mongoDB collection (statistics) has 17 million documents.

I tried QUERY 1 :

db.statistics.find({
"updated_at": {$lt: ISODate("2018-08-21 01:00:00"),  $gte: ISODate("2018-08-21 00:00:00")}
});

Returned : 534 documents. Time : 170 seconds.

After then, I tried QUERY 2 :

db.statistics.find({
"updated_at": {$lt: ISODate("2018-08-21 01:00:00")}
});

Returned : 16 million documents. Time : 0.065 second .

I tried explain on that.

Result QUERY 1 :

{
    "stage" : "COLLSCAN",
    "filter" : {
        "$and" : [
            {
            "updated_at" : {
                "$lt" : ISODate("2018-08-21T08:00:00.000+07:00")
            }
        },
            {
            "updated_at" : {
                "$gte" : ISODate("2018-08-21T07:00:00.000+07:00")
            }
        }
        ]
    },
    "nReturned" : 534,
    "executionTimeMillisEstimate" : 131587,
    "works" : 17784357,
    "advanced" : 534,
    "needTime" : 17783822,
    "needYield" : 0,
    "saveState" : 139020,
    "restoreState" : 139020,
    "isEOF" : 1,
    "invalidates" : 0,
    "direction" : "forward",
    "docsExamined" : 17784355
}

Result QUERY 2 :

{
    "stage" : "COLLSCAN",
    "filter" : {
        "updated_at" : {
            "$lt" : ISODate("2018-08-21T08:00:00.000+07:00")
        }
    },
    "nReturned" : 16676319,
    "executionTimeMillisEstimate" : 135988,
    "works" : 17784357,
    "advanced" : 16676319,
    "needTime" : 1108037,
    "needYield" : 0,
    "saveState" : 139089,
    "restoreState" : 139089,
    "isEOF" : 1,
    "invalidates" : 0,
    "direction" : "forward",
    "docsExamined" : 17784355
}

QUERY 2 very fast and QUERY 1 very slow.

How to find where is the cause of that?

Upvotes: 0

Answers (2)

kevinadi

Reputation: 13805

If you do Query 1 & Query 2 in that order, a likely cause was that Query 1 took longer since it needs to load all the collection's data into RAM. Now when Query 2 runs, the whole collection is already in RAM.

Basically you have a cold cache for Query 1, and you have a warm cache for Query 2. If you reverse the order of execution, you should find that Query 1 will be fast.

This is because both explain output show two very similar plans:

Both of them show COLLSCAN stage, which means that MongoDB must load the whole collection into RAM, examine all 17,784,355 documents one by one, and return the requested documents.
Both queries examined the same number of documents (17,784,355)
Both show almost identical executionTimeMillisEstimate

What you want to do is to eliminate any COLLSCAN in your explain output. Ideally you'd like to see IXSCAN instead.

For example, if you create an index on the collection based on your query:

db.statistics.createIndex({updated_at: 1})

the index will significantly speed up the two queries you have.

More resources that you might find useful:

Upvotes: 2

ALPHA

Reputation: 1155

in this case you use ls and gt for query 1 That takes itself implicit mongoDb filtering using two parameters for whole data collection ( which has millions of records) thus giving it a more delay than just using lt. When using lt , it doesn't want to filter using both parameters just the values which are below you required parameter. that's why it behaves like that.

to mitigate this longer querying time, you can use index in mongoDb to specifically identify which fields should it query first. ( kind of like ObjectIds which has already indexed at the beginning of the document creation)

Upvotes: 0

How to find the cause of slow query mongo DB?

Answers (2)

Related Questions