Raph

Reputation: 384

How to model list of elements in DocumentDB?

I would like to build a user activity feed of heterogeneous elements with DocumentDB.

I am considering three modeling scenarios, based on this article:

https://github.com/Azure/azure-content/blob/master/articles/documentdb/documentdb-modeling-data.md#when-not-to-embed

1. A single document per user with a nested array of feed elements

{
"userId": "1",    
"feed": [
    {"id": 1, "author": "anon", "image": "https://image.com/y.jpg"},
    {"id": 2, "author": "bob", "status": "wisdom from the interwebs"},
    …
    {"id": 100001, "author": "jane", "quote": "and on we go ..."},
    …
    {"id": 1000000001, "author": "angry", "status": "blah angry blah angry"},
    …
    {"id": ∞ + 1, "author": "bored", "xxx": "oh man, will this ever end?"}
    ]
}

This seems like a bad option: documents have a size limit, so a single ever-growing document does not scale.

2. One document per feed element

{
"userId": "1",
"id": 1,
"author": "anon",
"image": "https://image.com/y.jpg"
},
{
"userId": "1",
"id": 2,
"author": "bob",
"status": "wisdom from the interwebs"
},
...

This seems like a good solution, but I feel like I am wasting DocumentDB's potential. Is the model too flat? It may not be optimal.

3. X documents per user, each with one nested array of feed elements

{
"userId": 1,
"feed": [
    {"id": 4, "author": "anon", "image": "https://image.com/y.jpg"},
    {"id": 5, "author": "bob", "status": "tails from the field"},
    ...
    {"id": 99, "author": "angry", "status": "blah angry blah angry"}
]
},
{
"userId": 1,
"feed": [
    {"id": 100, "author": "anon", "status": "yet more"},
    ...
    {"id": 199, "author": "bored", "xxx": "will this ever end?"}
]
}

This seems like the best solution, but it adds a lot of complexity to the code (handling delete operations and pagination, handling WHERE clauses across different feed element types, ...). It also feels like I am building a functional concern (pagination) into the storage architecture, which makes it less flexible.
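To make that complexity concrete, here is a rough in-memory sketch of the bookkeeping scenario 3 seems to require. It does not use any DocumentDB SDK; the list of dicts stands in for the collection, and the names append_element, delete_from_buckets, read_page and BUCKET_CAPACITY are all hypothetical.

```python
from typing import Any

Bucket = dict[str, Any]

BUCKET_CAPACITY = 100  # hypothetical cap on elements per bucket document


def append_element(buckets: list[Bucket], user_id: str, element: dict,
                   capacity: int = BUCKET_CAPACITY) -> None:
    """Add to the user's newest bucket, or open a new one when it is full."""
    user_buckets = [b for b in buckets if b["userId"] == user_id]
    if user_buckets and len(user_buckets[-1]["feed"]) < capacity:
        user_buckets[-1]["feed"].append(element)
    else:
        buckets.append({"userId": user_id, "feed": [element]})


def delete_from_buckets(buckets: list[Bucket], user_id: str,
                        element_id: int) -> bool:
    """Find the bucket holding the element and remove it; leaves a gap."""
    for bucket in buckets:
        if bucket["userId"] != user_id:
            continue
        for index, element in enumerate(bucket["feed"]):
            if element["id"] == element_id:
                del bucket["feed"][index]
                return True
    return False


def read_page(buckets: list[Bucket], user_id: str,
              offset: int, limit: int) -> list[dict]:
    """Pages no longer line up with buckets once elements are deleted,
    so the feed has to be flattened before it can be sliced."""
    flat = [e for b in buckets if b["userId"] == user_id for e in b["feed"]]
    return flat[offset:offset + limit]
```

After a delete, buckets are unevenly filled, so the page size seen by the user and the bucket size in storage drift apart. That is the coupling I am worried about.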

Obviously, scenario 1 is not an option. What do you think of scenarios 2 and 3?

Upvotes: 0

Views: 157

Answers (1)

Petar Ivanov

Reputation: 136

In my experience, keeping it simple pays off with DocumentDB (considering its current limitations). Scenario 2 gives you the most straightforward way to manage CRUD operations and keeps the code simple. With the third approach, it seems you would need to constantly check how large each document has grown, which is also cumbersome.
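As a rough illustration of why Scenario 2 stays simple, here is a minimal in-memory sketch of pagination and delete over flat documents. It is not DocumentDB SDK code: the list of dicts stands in for the collection, page_of_feed and delete_element are hypothetical helpers, and it assumes that id values increase over time so they can double as a paging cursor.

```python
from typing import Any, Optional

Doc = dict[str, Any]


def page_of_feed(docs: list[Doc], user_id: str, page_size: int,
                 before_id: Optional[int] = None) -> list[Doc]:
    """Return the newest page_size elements for a user.

    before_id is the id of the last element on the previous page;
    only older elements (smaller id) are returned.
    """
    matching = [
        d for d in docs
        if d["userId"] == user_id
        and (before_id is None or d["id"] < before_id)
    ]
    return sorted(matching, key=lambda d: d["id"], reverse=True)[:page_size]


def delete_element(docs: list[Doc], user_id: str, element_id: int) -> list[Doc]:
    """Deleting one element is deleting one document; nothing else moves."""
    return [
        d for d in docs
        if not (d["userId"] == user_id and d["id"] == element_id)
    ]
```

Each operation touches only the documents it is about, so there is no size check and no rebalancing after a delete.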

That being said, if you use bulk insert, keep in mind that you will also need to split the records into batches (even with Scenario 2), because stored procedure calls are limited to 512 KB as well.
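One way to do that splitting is to group documents by their serialized size before each call. The sketch below is only an assumption about how you might do it: split_into_batches is a hypothetical helper, it measures the compact JSON encoding of each document, and the 512 KB figure is the limit mentioned above, so check it against the current service limits.

```python
import json

MAX_BATCH_BYTES = 512 * 1024  # limit cited above; verify for your account


def split_into_batches(docs, max_bytes=MAX_BATCH_BYTES):
    """Group docs so each batch, serialized as a compact JSON array,
    stays at or under max_bytes. The estimate is slightly conservative."""
    batches = []
    current = []
    current_size = 2  # the enclosing "[" and "]"
    for doc in docs:
        encoded = json.dumps(doc, separators=(",", ":")).encode("utf-8")
        size = len(encoded) + 1  # +1 for the comma separating array items
        if size + 2 > max_bytes:
            raise ValueError("a single document exceeds the batch limit")
        if current and current_size + size > max_bytes:
            batches.append(current)
            current = []
            current_size = 2
        current.append(doc)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

Each resulting batch can then be sent in its own stored procedure call, in the original order.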

In my experience, hierarchical structures are powerful in DocumentDB as long as they describe a piece of information that is meant to be looked at as a whole. If you need to look at nested parts of different documents, joins and stored procedures can help with that.

Hope this is useful despite any personal opinion in this post :)

Upvotes: 1
