MongoDB Schema Design With or Without Null Placeholders

Question

I am still getting used to using a schema-less, document-oriented database, and I am wondering what the generally accepted practice is regarding schema design within an application model.

Specifically, I'm wondering whether it is good practice to enforce a schema within the application model when saving to MongoDB, like this:

{
    _id: "foobar",
    name: "John",
    billing: {
        address: "8237 Landeau Lane",
        city: "Eden Prairie",
        state: "MN",
        postal: null
    },
    balance: null,
    last_activity: null
}

versus only storing the fields that are used like this:

{
    _id: "foobar",
    name: "John",
    billing: {
        address: "8237 Landeau Lane",
        city: "Eden Prairie",
        state: "MN"
    }
}

The former is self-descriptive, which I like, while the latter makes no assumptions about the mutability of the model schema.

I like the first option because it makes it easy to see at a glance what fields are used by the model yet currently unspecified, but it seems like it would be a hassle to update every document to reflect a new schema design if I wanted to add an extra field, like favorite_color.

How do most veteran MongoDB users handle this?

Vladimir Perevalov · Accepted Answer

I would suggest the second approach.

You can always see the intended structure by looking at your entity class in the source code. Or are you using a dynamic language and not creating an entity class?
You save a lot of space per record, because you don't have to store the names of fields whose values are null. This may not matter much on small collections, but on large ones with millions of records, I would even go as far as shortening the field names.
As you already mentioned, by storing optional fields as null placeholders you create a pattern that, if you want to keep following it, forces you to update all existing records whenever you add a new field. This is, again, a bad idea for a big DB.
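
If you still want the "at a glance" view of the model, one option is to keep the defaults in application code and fill them in when a document is read, rather than storing nulls. The following is only a minimal sketch of that idea in plain JavaScript. CUSTOMER_DEFAULTS and withDefaults are hypothetical names made up for illustration, not part of any MongoDB driver, and the stored document is the sparse one from the question.

```javascript
// Hypothetical defaults describing the intended shape of a customer document.
// This lives in application code, not in the database.
const CUSTOMER_DEFAULTS = {
  name: null,
  billing: { address: null, city: null, state: null, postal: null },
  balance: null,
  last_activity: null,
};

// Recursively fill in fields that are missing from a stored document.
// Fields that are present in the document are left untouched.
function withDefaults(doc, defaults) {
  const result = { ...doc };
  for (const [key, fallback] of Object.entries(defaults)) {
    const isNested =
      fallback !== null && typeof fallback === "object" && !Array.isArray(fallback);
    if (isNested) {
      result[key] = withDefaults(doc[key] || {}, fallback);
    } else if (!(key in result)) {
      result[key] = fallback;
    }
  }
  return result;
}

// The sparse document as it would be stored (second approach in the question).
const stored = {
  _id: "foobar",
  name: "John",
  billing: { address: "8237 Landeau Lane", city: "Eden Prairie", state: "MN" },
};

// The view the application works with: every expected field is present.
const customer = withDefaults(stored, CUSTOMER_DEFAULTS);
```

With this approach, adding a field such as favorite_color means adding one line to the defaults; existing records do not need to be touched.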

In any case, it all comes down to your DB size. If you aren't targeting many GBs or TBs of data, then both approaches are fine. But if you expect that your DB may grow really large, I would do anything to cut the size. Spending 30-40% of storage on field names is a bad idea.
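
To get a rough feel for the cost, here is a back-of-the-envelope sketch for the three null placeholders in the question. It relies on the BSON layout, where a null element is one type byte plus the field name as a NUL-terminated string, with no bytes for the value itself. The document count is a made-up figure for illustration, and the result is uncompressed BSON size, so the on-disk figure may be lower if your storage engine compresses data.

```javascript
// Bytes one null field adds to a BSON document:
// 1 type byte + UTF-8 field name + 1 terminating NUL byte.
function nullFieldBytes(fieldName) {
  return 1 + new TextEncoder().encode(fieldName).length + 1;
}

// The null placeholders from the first document in the question.
const placeholders = ["postal", "balance", "last_activity"];

const bytesPerDoc = placeholders.reduce(
  (sum, name) => sum + nullFieldBytes(name),
  0
);

// Hypothetical collection size, chosen only for illustration.
const documentCount = 10000000;

const totalMiB = (bytesPerDoc * documentCount) / (1024 * 1024);

console.log(`${bytesPerDoc} bytes per document, about ${totalMiB.toFixed(0)} MiB in total`);
```

The same arithmetic shows why shortening field names helps: every character removed from a name saves one byte in each document that stores that field.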
