MongoDB Schema Design suggestion

Question

I've used MongoDB for a while, but only for CRUD operations on schemas that somebody else had already designed. This is the first time I'm designing a schema myself, and I need some suggestions. The data I will collect from users is their regular information, their health-related information, and their insurance-related information. A single user will not have multiple health or insurance records, so it is a simple one-to-one relation. However, the health and insurance information will each have lots of fields. So my question is: is it good to have separate collections for the health and insurance information, as in:

     var mongoose = require('mongoose');
     var Schema = mongoose.Schema;

     var userSchema = {
         name: String,
         age: Number,
         // one-to-one, so a single reference rather than an array
         health_details: { type: Schema.Types.ObjectId, ref: 'Health' },       // reference to healthSchema
         insurance_details: { type: Schema.Types.ObjectId, ref: 'Insurance' }  // reference to insuranceSchema
     };

or to have a single collection with a large number of fields, as in:

     var userSchema = {
         name: String,
         age: Number,
         disease_name: String,           // and many other fields related to health
         insurance_company_name: String  // and many other fields related to insurance
     };

Talha Awan · Accepted Answer

Generally, some of the factors you can consider while modeling 1-to-1, 1-to-many and many-to-many data in NoSQL are:

1. Data duplication

Do you expect data to be duplicated? Not a one-word value like the hobby "gardening", which many users can share and which probably doesn't need a "hobbies" collection, but something like authors and books. That case guarantees duplication.

An author can write many books. You should not embed the author's details in every book, not even in two of them, because that is hard to maintain when the author's info changes. Use a 1-to-many relationship with references instead. The reference can go in either of the two documents: as "has many" (an array of bookIds in the author) or as "belongs to" (an authorId in each book).
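The "belongs to" variant can be sketched with plain objects standing in for the two collections. This is only an illustration of the document shapes, not driver code; the ids, names and the `booksWithAuthor` helper are made up for the example.

```javascript
// Hypothetical in-memory stand-ins for an "authors" and a "books" collection.
const authors = [
  { _id: 'a1', name: 'Jane Doe', bio: 'Writes about databases.' },
];

// "Belongs to": each book stores only the author's _id, not a copy of the author.
const books = [
  { _id: 'b1', title: 'First Book', authorId: 'a1' },
  { _id: 'b2', title: 'Second Book', authorId: 'a1' },
];

// Resolve the reference at read time (the job a second query or a join stage
// would do against a real database).
function booksWithAuthor(bookDocs, authorDocs) {
  const byId = new Map(authorDocs.map((a) => [a._id, a]));
  return bookDocs.map((b) => ({ ...b, author: byId.get(b.authorId) }));
}

// Changing the author touches exactly one document...
authors[0].name = 'Jane A. Doe';

// ...and every book picks up the new name on the next read.
const resolved = booksWithAuthor(books, authors);
console.log(resolved.map((b) => `${b.title} by ${b.author.name}`));
```

Had the author been embedded in each book, the rename would have needed one write per book.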

In the case of health and insurance, data duplication is not expected, so a single document is the better choice.

2. Read/write preference

What is the expected frequency of reads and writes of the data (not the collection)? For example, if you query a user together with their health and insurance records much more often than you update them (and factors 1 and 3 are not much of a problem), then this data should preferably be kept in, and queried from, a single document instead of three different sources.

Also, a single document is the unit for which MongoDB guarantees atomic writes without needing a multi-document transaction, which is an added benefit if you want to update user, health and insurance data all at the same time (say, in one API call).
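As a rough sketch of what such a combined update could look like, assuming health and insurance are stored as sub-documents inside the user document: the helper name `buildProfileUpdate`, the id `'u1'` and the sample values are hypothetical, and only the filter and update documents are built here so the snippet runs without a database.

```javascript
// Build the filter and update document for one write against one user document.
// The dotted paths assume sub-documents named "health" and "insurance".
function buildProfileUpdate(userId, changes) {
  return {
    filter: { _id: userId },
    update: {
      $set: {
        age: changes.age,
        'health.disease_name': changes.diseaseName,
        'insurance.company_name': changes.insuranceCompany,
      },
    },
  };
}

const { filter, update } = buildProfileUpdate('u1', {
  age: 41,
  diseaseName: 'asthma',
  insuranceCompany: 'Acme Insurance',
});

// With the official Node.js driver this would be sent as a single call, e.g.:
//   await db.collection('users').updateOne(filter, update);
// Only one document is touched, so the three fields change together or not at all.
console.log(JSON.stringify({ filter, update }, null, 2));
```

With three separate collections, the same change would take three writes, and keeping them consistent would be the application's job.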

3. Size of the document

Consider this: many users can like a post, and a user can like many posts (many-to-many). Since you need to ensure that no user likes a post twice, the user ids must be stored somewhere. There are three options:

- keep an array of user ids in the post document
- keep an array of post ids in the user document
- create another document that contains the ids of both (a solution for many-to-many only, similar to SQL)

If a post is liked by more than a million users, the post document will overflow with user references. Similarly, a user can like thousands of posts in a short period, so the second option is not feasible either. That leaves the third option, which is the best fit for this case.
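A minimal in-memory sketch of the third option follows. The `LikeStore` class and the field names `userId` and `postId` are assumptions for illustration; in MongoDB itself, the "no double like" rule would more typically be enforced by a unique compound index on a separate collection, for example `db.likes.createIndex({ userId: 1, postId: 1 }, { unique: true })`, where `likes` is a hypothetical collection name.

```javascript
// One small document per (user, post) pair, kept outside both the user
// and the post documents, so neither of them grows with the number of likes.
class LikeStore {
  constructor() {
    this.likes = [];        // stand-in for the separate "likes" collection
    this.pairs = new Set(); // plays the role of a unique index on the pair
  }

  // Returns true if the like was recorded, false if it already existed.
  like(userId, postId) {
    const key = JSON.stringify([userId, postId]);
    if (this.pairs.has(key)) return false;
    this.pairs.add(key);
    this.likes.push({ userId, postId });
    return true;
  }

  countForPost(postId) {
    return this.likes.filter((l) => l.postId === postId).length;
  }
}

const store = new LikeStore();
const first = store.like('u1', 'p1');  // recorded
const repeat = store.like('u1', 'p1'); // rejected: same user, same post
const other = store.like('u2', 'p1');  // recorded: different user
console.log(first, repeat, other, store.countForPost('p1'));
```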

A post, however, can have many comments while a comment belongs to only one post (1-to-many). You would hardly expect more than a few hundred comments, and only rarely a thousand. Therefore, keeping an array of commentIds (or the embedded comments themselves) in the post is a practical solution.

In your case, I don't believe a document that does not keep a huge list of references can grow enough to reach 16 MB (MongoDB's document size limit). You can therefore safely store the health and insurance data in the user document. But each should sit under a key of its own, like:

     var userSchema = {
         name: String,
         age: Number,
         health: {
             disease_name: String
             // more health information
         },
         insurance: {
             company_name: String
             // further insurance data
         }
     };

That's how you should think about designing your schema, in my opinion. I would recommend reading these very helpful guides by Couchbase on data modeling: document design considerations, modeling documents for retrieval, and modeling relationships. Although they are written for Couchbase, the rules are equally applicable to MongoDB schema design, as both are document-oriented NoSQL databases.
