Hommer Smith

Reputation: 27852

Difference between Embedded Array of Ids and Normalized style in MongoDB

I have been watching this video to learn MongoDB data modeling. For one-to-many relationships, the speaker describes three different kinds:

  1. Embedded array / array keys: A document has a field holding an array of references to other documents (for example, a blog_posts attribute in the user document would store the ids of all the blog posts that the user has created).
  2. Embedded tree: Rather than having an array of references to other things, we have documents inside documents, completely embedded.
  3. Normalized: You have two collections, with references between them.

So, what would be the difference between the embedded array keys and the normalized kind? Isn't the embedded array also holding references to another collection?

Upvotes: 0

Views: 535

Answers (1)

WiredPrairie

Reputation: 59763

The difference is simple (although the video unfortunately presents it in a somewhat confusing way).

Imagine modeling a blog post (Post) and comments (Comment).

  1. Embedded array: The Post document contains an array of the IDs of all of its Comment documents. Each Comment is stored in a separate document (and/or collection).
  2. Tree: The Post document contains the Comments themselves, embedded. They aren't stored as distinct documents or in their own collection. While this performs very well, the 16MB size limit on BSON documents can make it more difficult to work with.
  3. Normalized: Posts and Comments are stored as separate documents. In this case, however, the Comment document has a foreign-key-like reference back to its Post, for example a field called postId. This pattern differs from #1 in that the Post document does not contain a list of Comments. So, while this option makes the number of Comments essentially unbounded, retrieving the comments can be inefficient unless specific indexes are built (an index on postId and commentDate, for example, might be useful).
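
To make the three shapes concrete, here is a rough sketch that uses plain Python dicts as stand-ins for MongoDB documents. Nothing here comes from the video: the field names (commentIds, comments, postId), the sample data, and the helper functions are all assumptions made up for illustration, and the list comprehensions only imitate what a real query would do.

```python
# Plain dicts stand in for MongoDB documents; every name below is hypothetical.

# 1. Embedded array of ids: the Post lists the ids of its Comments,
#    and the Comments themselves live elsewhere.
post_with_ids = {"_id": "p1", "title": "Hello", "commentIds": ["c1", "c2"]}
comments_by_id = {
    "c1": {"_id": "c1", "text": "First"},
    "c2": {"_id": "c2", "text": "Second"},
}

# 2. Tree: the Comments are stored inside the Post document itself.
post_with_tree = {
    "_id": "p1",
    "title": "Hello",
    "comments": [{"text": "First"}, {"text": "Second"}],
}

# 3. Normalized: the Post knows nothing about its Comments;
#    each Comment points back at its Post through postId.
post_plain = {"_id": "p1", "title": "Hello"}
all_comments = [
    {"_id": "c1", "postId": "p1", "text": "First"},
    {"_id": "c2", "postId": "p1", "text": "Second"},
    {"_id": "c3", "postId": "p2", "text": "Elsewhere"},
]


def comments_via_id_array(post, lookup):
    """Follow each id in the Post's array to a separate Comment document."""
    return [lookup[comment_id]["text"] for comment_id in post["commentIds"]]


def comments_via_tree(post):
    """Read the Comments straight out of the Post; no second lookup."""
    return [comment["text"] for comment in post["comments"]]


def comments_via_back_reference(post, comments):
    """Filter the Comments by the Post's id (the reference lives on the Comment)."""
    return [c["text"] for c in comments if c["postId"] == post["_id"]]
```

All three helpers return the same two comment texts for post p1. The difference is where the link is stored: on the Post in #1, nowhere (the data is inline) in #2, and on the Comment in #3. The filter in the last helper is the step that an index on postId would speed up in a real collection.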

Upvotes: 1
