Mongo Tree Data Model Design

Question

My goal is to design a scalable recursive tree data model that is agnostic to vertical size, horizontal size, tree imbalances, and overall size.

On MongoDB's site, they talk about tree-structured data here:

http://docs.mongodb.org/manual/applications/data-models-tree-structures/

Interestingly, each data model they present uses a separate entry in the collection for every node, even for the sub-elements.

Let's call the following from mongodb.org example A:

db.categories.insert( { _id: "MongoDB", parent: "Databases" } )
db.categories.insert( { _id: "dbm", parent: "Databases" } )
db.categories.insert( { _id: "Databases", parent: "Programming" } )
db.categories.insert( { _id: "Languages", parent: "Programming" } )
db.categories.insert( { _id: "Programming", parent: "Books" } )
db.categories.insert( { _id: "Books", parent: null } )

Now, let's call this one example B:

singleEntry =
{
    _id: "Books",
    children:
    [
        {
            _id: "Programming",
            parent: "Books",
            children:
            [
                {
                    _id: "Languages",
                    parent: "Programming"
                },
                {
                    _id: "Databases",
                    parent: "Programming",
                    children:
                    [
                        {
                            _id: "MongoDB",
                            parent: "Databases"
                        },
                        {
                            _id: "dbm",
                            parent: "Databases"
                        }
                    ]
                }
            ]
        }
    ]
}

db.categories.insert(singleEntry)
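To make the relationship between the two examples concrete, here is a minimal sketch in plain JavaScript (no database connection) that folds the flat documents of example A into a nested document shaped like example B. The helper name buildTree is hypothetical and not part of any MongoDB API. The sketch assumes exactly one root with parent: null and that every other parent value matches an existing _id.

```javascript
// Flat parent-reference documents, copied from example A.
const flat = [
  { _id: "MongoDB", parent: "Databases" },
  { _id: "dbm", parent: "Databases" },
  { _id: "Databases", parent: "Programming" },
  { _id: "Languages", parent: "Programming" },
  { _id: "Programming", parent: "Books" },
  { _id: "Books", parent: null },
];

// buildTree is a hypothetical helper for illustration only.
// It returns the root node with descendants nested under "children".
function buildTree(docs) {
  // Index a shallow copy of every document by its _id.
  const byId = new Map();
  for (const doc of docs) byId.set(doc._id, { ...doc });

  let root = null;
  for (const node of byId.values()) {
    if (node.parent === null) {
      root = node; // assumption: exactly one root
      continue;
    }
    const parent = byId.get(node.parent);
    if (!parent) throw new Error(`unknown parent: ${node.parent}`);
    if (!parent.children) parent.children = [];
    parent.children.push(node);
  }
  return root;
}

const nested = buildTree(flat);
console.log(JSON.stringify(nested, null, 2));
```

The result differs from example B in two small ways: the root keeps its parent: null field, and siblings appear in the order they were listed in example A.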

I really like example B; though having the parent-child relation double referenced is uncomfortably redundant, I could not find a way to avoid this in practical usage. Also, the queries are a bit more involved:

db.categories.find(
    {
        'children.children.children._id' : 'MongoDB'
    }
)

but I don't mind as long as everything in example A is possible with example B.
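The dotted path in that query hard-codes the depth of the node being searched for. A depth-agnostic lookup on example B can be sketched on the client side, once the document has been loaded, as a recursive walk. This is plain JavaScript with no database connection; findPath is a hypothetical helper, not a MongoDB feature.

```javascript
// The nested document from example B.
const singleEntry = {
  _id: "Books",
  children: [
    {
      _id: "Programming",
      parent: "Books",
      children: [
        { _id: "Languages", parent: "Programming" },
        {
          _id: "Databases",
          parent: "Programming",
          children: [
            { _id: "MongoDB", parent: "Databases" },
            { _id: "dbm", parent: "Databases" },
          ],
        },
      ],
    },
  ],
};

// findPath is a hypothetical helper: depth-first search that returns the
// list of _id values from the root down to the target, or null if absent.
function findPath(node, id, path = []) {
  const here = [...path, node._id];
  if (node._id === id) return here;
  for (const child of node.children || []) {
    const found = findPath(child, id, here);
    if (found) return found;
  }
  return null;
}

const path = findPath(singleEntry, "MongoDB");

// Each step below the root corresponds to one "children." segment
// in the dotted query path.
const dottedPath = "children.".repeat(path.length - 1) + "_id";

console.log(path);
console.log(dottedPath);
```

For "MongoDB" the walk passes through three levels below the root, which is why the query above needs three children segments. The recursion depth of this walk grows with the depth of the tree.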

I get the feeling it might not be. Other gotchas I'm worried about are:

maximum entry size
maximum stack size
inserting into a highly nested collection entry and causing havoc in the engine that rearranges data in memory
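The first two worries can be checked before inserting. MongoDB documents a 16 MB limit on a single BSON document and a limit of 100 levels of nesting. The sketch below is a rough pre-check in plain JavaScript (Node.js, because it uses Buffer). It rests on two loudly stated assumptions: every object and every array counts as one nesting level, and the JSON byte length is only an approximation of the real BSON size. The helper names nestingDepth, approxBytes, and makeChain are hypothetical.

```javascript
// Limits as stated in the MongoDB documentation.
const MAX_DOC_BYTES = 16 * 1024 * 1024;
const MAX_NESTING = 100;

// Hypothetical helper: counts nesting levels, treating each object and
// each array as one level (an assumption about how the limit is counted).
function nestingDepth(value) {
  if (value === null || typeof value !== "object") return 0;
  let deepest = 0;
  for (const child of Object.values(value)) {
    deepest = Math.max(deepest, nestingDepth(child));
  }
  return 1 + deepest;
}

// Hypothetical helper: JSON byte length, a rough stand-in for BSON size.
function approxBytes(doc) {
  return Buffer.byteLength(JSON.stringify(doc), "utf8");
}

// Hypothetical helper: builds a degenerate tree in the style of example B
// where every node has exactly one child, "levels" nodes deep.
function makeChain(levels) {
  let node = { _id: `node${levels}` };
  for (let i = levels - 1; i >= 1; i--) {
    node = { _id: `node${i}`, children: [node] };
  }
  return node;
}

for (const levels of [3, 50, 51]) {
  const doc = makeChain(levels);
  const depth = nestingDepth(doc);
  console.log(
    `${levels} tree levels: nesting ${depth},`,
    depth <= MAX_NESTING ? "within" : "over",
    "the nesting limit;",
    approxBytes(doc) <= MAX_DOC_BYTES ? "within" : "over",
    "the size limit (approximate)"
  );
}
```

Under this counting, each level of the tree costs two levels of nesting (the node object plus its children array), so a single-document tree in the style of example B would run into the nesting limit at roughly 50 tree levels, well before an unbalanced tree gets very tall.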

My initial understanding of MongoDB was that it's intended for schema design like this. However, when I look at the documentation, the examples, and even the shell methods, it looks like they really intended it to be used like example A.

Something about example A seems too relation-y; it's kind of what I was trying to get away from. If I go with example A, why not just use SQL? If I go with example B, what can I expect to run into?
