Mistre83

Reputation: 2827

MongoDB - Index scan low performance

I'm very new to MongoDB and I'm trying to test performance to understand whether my structure is sound.

I have a collection with five fields (three dates, one integer, and one pointer to another collection's ObjectId).

In this collection I've created a compound index on two fields. The index name is _p_monitor_ref_1_collected_-1.

I created this index at the beginning and populated the collection with some records. After that, I duplicated the records many times with this script:

var bulk = db.measurements.initializeUnorderedBulkOp();

// Queue a copy of each existing document with a fresh _id.
db.measurements.find().limit(1483570).forEach(function (document) {
    document._id = new ObjectId();
    bulk.insert(document);
});

// Send the queued inserts to the server.
bulk.execute();

The collection now holds about 3 million documents.

Now I run explain() to see whether the query uses the index and how long it takes to execute. This is the query:

db.measurements.find({ "_p_monitor_ref": "Monitors$iKNoB6Ga5P" }).sort({collected: -1}).explain()

As you can see, I use _p_monitor_ref to select all documents by pointer, then sort on collected: -1 (the second key of the index).
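For reference, an index with that name would have been created with something like this (a sketch; the actual command isn't shown in the question, but the name _p_monitor_ref_1_collected_-1 encodes these keys and directions):

db.measurements.createIndex({ _p_monitor_ref: 1, collected: -1 })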

This is the first result when I run it. MongoDB uses the index (BtreeCursor _p_monitor_ref_1_collected_-1), but the execution time is very high: "millis" : 120286.

{
    "cursor" : "BtreeCursor _p_monitor_ref_1_collected_-1",
    "isMultiKey" : false,
    "n" : 126862,
    "nscannedObjects" : 126862,
    "nscanned" : 126862,
    "nscannedObjectsAllPlans" : 126862,
    "nscannedAllPlans" : 126862,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 23569,
    "nChunkSkips" : 0,
    "millis" : 120286,
    "indexBounds" : {
        "_p_monitor_ref" : [
            [
                "Monitors$iKNoB6Ga5P",
                "Monitors$iKNoB6Ga5P"
            ]
        ],
        "collected" : [
            [
                {
                    "$maxElement" : 1
                },
                {
                    "$minElement" : 1
                }
            ]
        ]
    },
    "server" : "my-pc",
    "filterSet" : false
}
{
    "cursor" : "BasicCursor",
    "isMultiKey" : false,
    "n" : 2967141,
    "nscannedObjects" : 2967141,
    "nscanned" : 2967141,
    "nscannedObjectsAllPlans" : 2967141,
    "nscannedAllPlans" : 2967141,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 27780,
    "nChunkSkips" : 0,
    "millis" : 11501,
    "server" : "my-pc",
    "filterSet" : false
}

Now, if I execute the explain again, this is the result, and the time is "millis" : 201:

{
    "cursor" : "BtreeCursor _p_monitor_ref_1_collected_-1",
    "isMultiKey" : false,
    "n" : 126862,
    "nscannedObjects" : 126862,
    "nscanned" : 126862,
    "nscannedObjectsAllPlans" : 126862,
    "nscannedAllPlans" : 126862,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 991,
    "nChunkSkips" : 0,
    "millis" : 201,
    "indexBounds" : {
        "_p_monitor_ref" : [
            [
                "Monitors$iKNoB6Ga5P",
                "Monitors$iKNoB6Ga5P"
            ]
        ],
        "collected" : [
            [
                {
                    "$maxElement" : 1
                },
                {
                    "$minElement" : 1
                }
            ]
        ]
    },
    "server" : "my-pc",
    "filterSet" : false
}
{
    "cursor" : "BasicCursor",
    "isMultiKey" : false,
    "n" : 2967141,
    "nscannedObjects" : 2967141,
    "nscanned" : 2967141,
    "nscannedObjectsAllPlans" : 2967141,
    "nscannedAllPlans" : 2967141,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 23180,
    "nChunkSkips" : 0,
    "millis" : 651,
    "server" : "my-pc",
    "filterSet" : false
}

Why do I get two such different results? Maybe the second execution reads the data from some kind of cache...

The collection currently has 3 million records... what happens when it grows to 10/20/30 million?

I don't know if I'm doing something wrong. Admittedly, I'm running this on my laptop (no SSD).

Upvotes: 1

Views: 78

Answers (1)

profesor79

Reputation: 9473

The reason the second attempt has a smaller execution time is that the first attempt forced MongoDB to load the data into memory, and that data was still in memory when the second attempt ran.

As your collection grows, the index will grow as well, so at some point it may become too big to fit in available memory. The MongoDB engine will then have to load and unload parts of the index, and performance will vary.
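To get a feel for whether the index still fits in RAM, you can compare its size against the server's memory from the mongo shell (a sketch, assuming the measurements collection from the question):

db.measurements.stats().indexSizes   // size of each index, in bytes
db.serverStatus().mem                // resident/virtual memory of the mongod process

If the relevant entry in indexSizes (plus the working set of documents) approaches the resident memory available to mongod, you can expect the kind of paging behaviour described above.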

Upvotes: 1
