Ahmet K

Reputation: 813

Pagination on nested arrays (attributes)

I'm trying to follow the "one table" principle in my NoSQL DB model for a social network, but it raises a lot of questions for me.
Let's say my model currently looks like this:

Table-Groups
{
  name: "Group1",
  topics: [
    {
      name: "Topic1",
      posts: [
        {
          id: "tid1",
          author: "Walter White",
          message: "Hello from Post1",
          comments: [
            {
              id: "cid1",
              author: "Jessy",
              message: "Yo nice post Mr. White"
            },
            {
              id: "cid2",
              author: "Saul",
              message: "Jeze Walt"
            }
          ]
        }
        //... Many other posts here
      ]
    }
    //... Many other topics within the group
  ]
}
//... Not so many other groups

Would I be able to paginate the post or comments array?
Since I would (in theory) have a lot of posts in the posts array, I would have to read a lot of data when I actually just want the latest 10 posts. The same goes for the comments array in a post. Is there any possibility to paginate these arrays?

Can I use the attribute "name" in the topic array as sortKey ? (topic.name)
Is there any way to use an attribute of a nested array as a sort key? In my example there are many topics in a group, so it would make sense to use topic names as sort keys (or even as the partition key, if I am allowed to split the table).


I have the feeling that I should split the table into at least two. With that I could use the topic name as partition key and the group name as sort key. But I'm really new to NoSQL DBs, and what I learned is that you should only use one table. What is your opinion?

Upvotes: 2

Views: 894

Answers (2)

Soccergods

Reputation: 460

As Pedro's answer says, you'll quickly run into the problem of your item size exceeding 400 KB.

The whole point of a NoSQL database like DynamoDB is that you should be able to model your system (regardless of how complex) in a single table. Nothing forces you to use only one table, but your current requirements can be modeled with one.

Try to separate out groups, topics, posts and comments, and use their ids as partition keys. To implement pagination, you could query posts with a limit. You could implement your table the way the other answer describes, and maybe add GSIs if you need different types of queries.

Upvotes: 0

Pedro Arantes

Reputation: 5379

Would I be able to paginate the post or comments array?

No. Your model has a single item you call Group. When your server runs GetItem, the whole item comes back: all topics and, inside them, all posts and all comments too.

There is another big problem in your model: your group can grow indefinitely, and the max size of a DynamoDB item is 400 KB. See the docs:

"The maximum item size in DynamoDB is 400 KB, which includes both attribute name binary length (UTF-8 length) and attribute value lengths (again binary length). The attribute name counts towards the size limit."

In other words, at some point, you won't be able to save more topics or posts.
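To get a feel for how soon that limit bites, here is a rough back-of-the-envelope sketch. It approximates item size as the UTF-8 length of compact JSON, which is an assumption and not DynamoDB's exact accounting; make_post and the 200-character message are made up for illustration.

```python
import json

MAX_ITEM_BYTES = 400 * 1024  # DynamoDB's documented per-item limit


def approx_size(obj):
    """Rough size estimate: UTF-8 length of compact JSON.

    This is only an approximation, not DynamoDB's real size calculation.
    """
    return len(json.dumps(obj, separators=(",", ":")).encode("utf-8"))


def make_post(n):
    """Hypothetical post with a 200-character message and no comments."""
    return {
        "id": f"tid{n}",
        "author": "Walter White",
        "message": "x" * 200,
        "comments": [],
    }


per_post = approx_size(make_post(0))
# Roughly how many such posts fit before one Group item hits the limit.
capacity = MAX_ITEM_BYTES // per_post
print(per_post, capacity)
```

Under these assumptions a single Group item fills up after a bounded number of posts, and comments only make it worse.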

Can I use the attribute "name" in the topic array as sortKey ? (topic.name)

No. See the docs, which state: "Each primary key attribute must be a scalar (meaning that it can hold only a single value). The only data types allowed for primary key attributes are string, number, or binary. There are no such restrictions for other, non-key attributes."

I have the feeling that I should split the Table in at least two. With that I could use topicname as partitionkey and group name as sort key.

I don't think you should split into two tables. You could model your DynamoDB table this way and keep only one table:

  1. Use hashKey and sortKey in your table.

  2. Save your Groups items like this:

    • hashKey: group (it's string group and not a variable)
    • sortKey: groupId
    • name: groupName

  3. Save your topics items this way:

    • hashKey: groupId
    • sortKey: topicId
    • name: topicName

  4. Save your posts items like this:

    • hashKey: topicId
    • sortKey: postId
    • author: author
    • message: message

  5. Save your comments items this way:

    • hashKey: postId
    • sortKey: commentId
    • author: author
    • message: message
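As a minimal sketch of the layout above, this is how the nested Group from the question could be flattened into separate items. The flatten function and the ids groupId1 and topicId1 are hypothetical (the question's topics have no id), and this only builds plain dicts; it does not talk to DynamoDB.

```python
def flatten(group_id, group):
    """Turn one nested group into flat items keyed by hashKey/sortKey."""
    # The literal string "group" is the hash key for all group items.
    items = [{"hashKey": "group", "sortKey": group_id, "name": group["name"]}]
    for topic in group["topics"]:
        topic_id = topic["id"]  # assumed: each topic carries its own id
        items.append(
            {"hashKey": group_id, "sortKey": topic_id, "name": topic["name"]}
        )
        for post in topic["posts"]:
            items.append({
                "hashKey": topic_id,
                "sortKey": post["id"],
                "author": post["author"],
                "message": post["message"],
            })
            for comment in post.get("comments", []):
                items.append({
                    "hashKey": post["id"],
                    "sortKey": comment["id"],
                    "author": comment["author"],
                    "message": comment["message"],
                })
    return items


group = {
    "name": "Group1",
    "topics": [{
        "id": "topicId1",
        "name": "Topic1",
        "posts": [{
            "id": "tid1",
            "author": "Walter White",
            "message": "Hello from Post1",
            "comments": [
                {"id": "cid1", "author": "Jessy",
                 "message": "Yo nice post Mr. White"},
                {"id": "cid2", "author": "Saul", "message": "Jeze Walt"},
            ],
        }],
    }],
}

items = flatten("groupId1", group)
for item in items:
    print(item["hashKey"], item["sortKey"])
```

Each item stays small, so the 400 KB limit applies per group, topic, post or comment instead of to the whole tree.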

With that, if you want to retrieve a single item, you run a GetItem with the full key: hashKey and sortKey.

Instead, if you want to query with pagination, you provide only hashKey in your query and limit it by 10 as you want (docs about query limits).
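A paginated query along those lines could look roughly like this. It is a sketch, assuming a low-level boto3 DynamoDB client is passed in and that the key attribute is literally named hashKey as in the list above; query_page and the table name are hypothetical.

```python
def query_page(client, table_name, hash_key, limit=10,
               start_key=None, newest_first=False):
    """Fetch one page of items that share the same hashKey.

    Returns (items, next_key). next_key is None once there are no more pages.
    """
    kwargs = {
        "TableName": table_name,
        "KeyConditionExpression": "#pk = :pk",
        "ExpressionAttributeNames": {"#pk": "hashKey"},
        "ExpressionAttributeValues": {":pk": {"S": hash_key}},
        "Limit": limit,
        # False means descending sort key order.
        "ScanIndexForward": not newest_first,
    }
    if start_key is not None:
        # Continue right after the last item of the previous page.
        kwargs["ExclusiveStartKey"] = start_key
    response = client.query(**kwargs)
    return response["Items"], response.get("LastEvaluatedKey")
```

With boto3 you would pass something like boto3.client("dynamodb") as client, call query_page(client, "MyTable", "topicId1") for the first 10 posts of a topic, and feed the returned next_key back in as start_key to get the following page.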

Finally, if you want to query by time (the most recent first, in your case), you could prefix your sort keys with the date/time. For instance, 2019-08-11-22-03-03_SOME_STRING. See the docs about querying by time.
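Here is a small sketch of why that prefix works: with zero-padded fields ordered from year down to second, plain string ordering matches chronological ordering. The helper name time_sort_key and the post ids are made up for the example, and the sorting is done in plain Python just to show the ordering.

```python
from datetime import datetime, timezone


def time_sort_key(created_at, suffix):
    """Build a sort key like 2019-08-11-22-03-03_SOME_STRING."""
    return created_at.strftime("%Y-%m-%d-%H-%M-%S") + "_" + suffix


keys = [
    time_sort_key(datetime(2019, 8, 11, 22, 3, 3, tzinfo=timezone.utc), "postB"),
    time_sort_key(datetime(2019, 8, 10, 9, 0, 0, tzinfo=timezone.utc), "postA"),
    time_sort_key(datetime(2019, 12, 1, 0, 0, 0, tzinfo=timezone.utc), "postC"),
]

# Descending string order puts the most recent post first.
newest_first = sorted(keys, reverse=True)
for key in newest_first:
    print(key)
```

On the DynamoDB side, a Query with ScanIndexForward set to false returns items in descending sort key order, so combined with a limit of 10 you get the 10 most recent posts.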

Upvotes: 8
