Reputation: 1443
I'm just starting out with node and mongodb and am trying to understand how best to structure the data (having come from a lifetime of sql).
So I've ended up with a data structure that's mostly embedded. I believe the relationships to be logical, but I would like some outside feedback before I go too far down the rabbit hole!
Here's my proposed new mongodb data model:
user
    name (string)
    email (string)
    avatar (string)
    password (string)
    newsletter (binary)
account
    admins
        user (objectId)
    name (string)
    logo (string)
    sub (number)
    stripe (string)
    property
        users
            user (objectId)
            party (number)
            role (number)
            admin (binary)
        name (string)
        ecd (date)
        complete (binary)
        activity
            description (string)
            user (objectId)
            time (date)
        task_group
            position (number)
            name (string)
            task
                assigned
                    user (objectId)
                    complete (binary)
                name (string)
                description (string)
                due (date)
                visibility (number)
                comment
                    user (objectId)
                    time (date)
                    comment (string)
Previously (I'm rebuilding an existing sql app) there were a lot of tables purely to bridge the data, e.g. account_link to connect users with accounts (many to many). These have now been embedded, which allows for a slightly more intuitive structure. Given that the embedded data only needs to be accessed in the context of its parent, I think this is the way to go.
My concern is that certain sub docs will grow quite large. Do I have to worry at all about how much data is contained in a sub doc? Or should I treat sub docs exactly as I would tables? i.e. if it transpires that each task_group contains 400,000 tasks, will that unnecessarily 'bloat' a property? Is there a point where you split this content out and create 'linking tables' purely for practical/performance reasons? Or am I just so stuck in the sql mindset that this just feels wrong?
Update
Given the advice received and referenced I believe I've produced a more appropriate design, although as has been noted elsewhere, it's more of an art than a science. Feedback still welcome!
Important considerations:
I won't re-write the linked blog post, but to summarise:
I've also accounted for growth/document size consistency as referenced in one of the answers.
USER
    name (string)
    email (string)
    avatar (string)
    password (string)
    newsletter (binary)
ACCOUNT
    admins (USER reference array)
    name (string)
    logo (string)
    sub (number)
    stripe (string)
    properties (PROPERTY reference array)
PROPERTY
    name (string)
    ecd (date)
    complete (binary)
    users
        user (USER objectId)
        party (number)
        role (number)
        admin (binary)
    activity
        description (string)
        time (date)
    task_groups (TASK_GROUP reference array)
TASK_GROUPS
    property (PROPERTY objectId)
    position (number)
    name (string)
    task
        assigned
            user (USER objectId)
            complete (binary)
        name (string)
        description (string)
        due (date)
        visibility (number)
        comment
            user (USER objectId)
            time (date)
            comment (string)
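To make the references in this revised model concrete, here is a rough sketch using made-up sample documents. Everything in it is an assumption for illustration only: the values, the string ids standing in for ObjectIds, the reading of "binary" as a boolean, and the hypothetical `taskGroupsFor` helper. It runs against plain in-memory arrays rather than a real database.

```javascript
// Hypothetical PROPERTY document; string ids stand in for ObjectId values.
const sampleProperty = {
  _id: 'p1',
  name: '12 Example Street',
  ecd: new Date('2015-06-01'),
  complete: false, // "binary" in the model, assumed to mean boolean
  users: [{ user: 'u1', party: 1, role: 2, admin: true }],
  activity: [{ description: 'Property created', time: new Date('2015-01-10') }],
  task_groups: ['g1', 'g2'], // references to TASK_GROUPS documents
};

// Hypothetical TASK_GROUPS documents, each pointing back at its PROPERTY.
const sampleTaskGroups = [
  { _id: 'g2', property: 'p1', position: 2, name: 'Finance', task: [] },
  {
    _id: 'g1',
    property: 'p1',
    position: 1,
    name: 'Legal',
    task: [
      {
        assigned: [{ user: 'u1', complete: false }],
        name: 'Sign contract',
        description: 'Send the contract to the buyer',
        due: new Date('2015-03-01'),
        visibility: 1,
        comment: [],
      },
    ],
  },
  { _id: 'g3', property: 'p2', position: 1, name: 'Other property', task: [] },
];

// Resolve the references held by a property into its task groups,
// ordered by their "position" field.
function taskGroupsFor(property, allTaskGroups) {
  return allTaskGroups
    .filter((group) => property.task_groups.includes(group._id))
    .sort((a, b) => a.position - b.position);
}
```

The property document stays small because it only holds ids, while each task group carries its own tasks.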
Upvotes: 1
Views: 133
Reputation: 71
I would even go so far as to separate the task from the task group and make the group that it belongs to a property of the task. You may want to query for every task in a group, which you can do as long as each task records which group it belongs to.
But you may also want to find a particular task, and the information about the group or groups it belongs to might still be relevant to that task. If you embed a task in a task group in that fashion, your application first has to figure out which category/group the task might belong to. Maybe the groups function more like a filter: find a task with this description amongst these groups.
The different queries you might want to run on these structures become more obvious when you think about how you want to query. The next step, after working out your queries and building your model, is indexing. If you need an index on an embedded document, it should probably be a separate model related to the original, but this also depends on how much the embedded document relies on the properties of the structure above it.
tl;dr
A common rule of thumb I have found: if you do a lot of reads and very few writes on the embedded documents, it is OK to embed. With heavy writes as well as reads, you will want to separate the embedded documents out.
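As a rough sketch of what that separation could look like, the snippet below keeps tasks and task groups in two plain in-memory arrays, with each task holding the id of its group. The sample data, the `task_group` field name and both helper functions are hypothetical. In MongoDB the first helper would correspond roughly to a `find` on the reference field, and the second to a `find` using `$in` on that field followed by a lookup of the group.

```javascript
// Hypothetical stand-ins for two collections; string ids replace ObjectIds.
const groupDocs = [
  { _id: 'g1', position: 1, name: 'Legal' },
  { _id: 'g2', position: 2, name: 'Finance' },
];

const taskDocs = [
  { _id: 't1', task_group: 'g1', name: 'Contract', description: 'Send the contract to the buyer' },
  { _id: 't2', task_group: 'g1', name: 'Searches', description: 'Order the local searches' },
  { _id: 't3', task_group: 'g2', name: 'Deposit', description: 'Confirm the deposit has arrived' },
];

// Every task in one group: a single equality match on the reference field.
function tasksInGroup(groupId) {
  return taskDocs.filter((task) => task.task_group === groupId);
}

// Groups used as a filter: find one task by description among several
// groups, then attach the group document it belongs to.
function findTaskInGroups(text, groupIds) {
  const task = taskDocs.find(
    (t) => groupIds.includes(t.task_group) && t.description.includes(text)
  );
  if (!task) return null;
  const group = groupDocs.find((g) => g._id === task.task_group);
  return { ...task, group };
}
```

Because the group id lives on the task, that one field is also the natural candidate for an index.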
Upvotes: 0
Reputation: 4700
Look at these pictures before I explain them:
Every document in a collection has its own place and allocated space. When a document grows and there is not enough space, it is moved to the end of the collection and free space is left behind. For example, say you have a post collection with embedded comments:
post collection document (comments embedded):
{
  _id: ObjectId('101'),
  comments: [
    { author: 'john', text: 'some text' },
    { author: 'mike', text: 'some text' }
  ]
}
This model is useful when a post will only ever have a few comments (one, two or three). When comments can be pushed without limit, you should model the documents with references instead: there will be a post collection and a comments collection.
post collection document:
{
  _id: ObjectId('101')
}
comments collection document:
{
  _id: ObjectId('10001'),
  _postId: ObjectId('101'), // reference to the post collection document
  text: 'some text',
  author: 'john'
}
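The difference between the two models can be sketched in plain JavaScript, with in-memory objects standing in for the collections. The helper names, the string ids and the JSON-length size proxy are all assumptions for illustration (real BSON sizes differ). In MongoDB the embedded version would correspond roughly to an update using `$push` on the post, and the referenced version to an insert into the comments collection plus a query on `_postId`.

```javascript
// Rough size proxy: length of the JSON text, not the real BSON size.
function approxSize(doc) {
  return JSON.stringify(doc).length;
}

// Embedded model: every new comment makes the post document itself grow.
function addEmbeddedComment(post, comment) {
  post.comments.push(comment);
  return post;
}

// Referenced model: a comment is its own document pointing at the post,
// so the post document keeps the same size however many comments arrive.
function addReferencedComment(commentsCollection, postId, comment) {
  const doc = {
    _id: `c${commentsCollection.length + 1}`, // hypothetical id scheme
    _postId: postId,
    ...comment,
  };
  commentsCollection.push(doc);
  return doc;
}

// Reading the comments of one post is a filter on the reference field.
function commentsForPost(commentsCollection, postId) {
  return commentsCollection.filter((c) => c._postId === postId);
}
```

A post with embedded comments gets bigger on each call to `addEmbeddedComment`, while in the referenced model only the comments array changes.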
Upvotes: 1
Reputation: 4700
http://docs.mongodb.org/manual/tutorial/model-embedded-one-to-one-relationships-between-documents/ Here is the documentation on:
1) Model One-to-One Relationships with Embedded Documents
2) Model One-to-Many Relationships with Embedded Documents
3) Model One-to-Many Relationships with Document References
Read all three sections.
I will say only one thing: when embedded docs grow quite large, you have to use document references rather than embedding.
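To put a rough number on "quite large": MongoDB documents the maximum BSON document size as 16 megabytes, and embedded sub docs count towards their parent's size. The back-of-the-envelope sketch below estimates how many embedded items fit under that limit. The helper names and the average item size are assumptions you would replace with measurements from your own data.

```javascript
// Documented maximum size of a single BSON document: 16 megabytes.
const MAX_BSON_DOCUMENT_BYTES = 16 * 1024 * 1024;

// How many embedded items of a given average size fit in one document,
// after reserving space for the parent's own fields.
function maxEmbeddedItems(avgItemBytes, parentOverheadBytes = 0) {
  return Math.floor((MAX_BSON_DOCUMENT_BYTES - parentOverheadBytes) / avgItemBytes);
}

// Whether a planned number of embedded items can fit at all.
function fitsInOneDocument(itemCount, avgItemBytes, parentOverheadBytes = 0) {
  return itemCount <= maxEmbeddedItems(avgItemBytes, parentOverheadBytes);
}
```

With an assumed average of 300 bytes per task, about 55,900 tasks fit in one document, so a task_group holding 400,000 embedded tasks would not fit and would need references.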
Upvotes: 0