BioXD
BioXD

Reputation: 357

Does unsetting a field improve performance if I query for documents that contain it?

I'm building an app where I have a standard collection with 500+ products. The company constantly runs sales, so at any given point, 2-10 products will be on sale.

I'm still trying to wrap my head around how to model in Mongo, but I'm trying to think in the "model after how the data will be accessed" style. Since the products page will be accessed more often than anything else, I'm thinking of adding the sale information directly into the product collection. Like this:

{
    _id: 1, 
    name: "Widget", 
    price: 15.99, 
    ...
    sale: {
        reducedPrice: 9.99
        saleStarts: "Nov 11, 2016", 
        saleEnds: "Nov 18, 2016", 
    }
}

I do have a page where all the current sales will be listed. It is not accessed as often, but it needs to exist. My question is about performance in that query as I don't want to go through every product every time that page is loaded and I'm trying to avoid duplicating information by having a second Sales collection.

As I understand, when Mongo goes trough the a collection, if I'm looking for something like this:

Products.find({ sale: { $exists: true } })

It doesn't really go trough all the records. So if I unset "sale" whenever a sale ends and just keep the field in the records that are currently on sale, then performance shouldn't really be too bad.

My question is: Am I missing something here? Is there a better way to do it?

Upvotes: 0

Views: 95

Answers (1)

Sven
Sven

Reputation: 5265

The way MongoDB works, and many other databases, is that you need an index on a field that you want to query for with reasonable performance. Indexes in databases are held in memory, meaning that a query against a field that is indexed will not require a scan against the hard drive, but rather scanned in-memory in an efficient-to-traverse data structure, thus resulting in much better performance. There are many other details to this that I don't need to get into and Google will explain very well.

You can read more about indexes on MongoDB's docs, but to really answer your question; if you don't have an index on the sale field in your collection, MongoDB will be forced to scan all documents in that collection from disk (although some may be cached in-memory).

You will have to find the sweet spot for how many indexes your server can hold, and tradeoff indexes for collections that isn't accessed as often as other collections. The more indexes you have, the more RAM the mongod daemon will consume.

Upvotes: 1

Related Questions