Reputation: 2040
I need a table which should have following attributes: userId, commentId, commentTopic, commentCountry. I want to avoid using Secondary Global indexes. What should be best design which will allow me the followings operations:
1) Get all comments belonging to a userId.
2) Get a comment with commentID = "commentID"
3) Get comments of all users where topic = "commentTopic" and country = "commentCountry"
I am thinking to have:
1) userId#commentId as partitionKey which will uniformly distribute the load and allow me to perform operation 1 and 2 from above list.
2) commentCountry#commentTopic as RangeKey which will allow me to perform operation 3 efficiently.
Is there any drawback of this approach. Will this table design be efficient? What are disadvantages of concatenating multiple keys together?
Upvotes: 1
Views: 621
Reputation: 906
PK = USER#userid
SK = COMMENT#commentId(GSIPK)
GSISK = commentTopic#country
Get all comments from a user
PK = USER#userid AND SK START_WITH COMMENT#
Get a comment by comment id can be against GSI
GSIPK = COMMENT#commentId
Get comments of all users where topic = "commentTopic" and country = "commentCountry"
GSIPK START_WITH COMMENT# AND GSISK = commentTopic#country
Upvotes: 0
Reputation: 8305
Here one idea that you can use. You can define one table Comments with the following attributes:
UserId (String) - partition key
CommentId (String) - sort key
Define the following GSI:
And the following GSI:
CommentTopicAndCommentCountry (String, GSI partition key) - combined field that has two values combined together, something like "123_Germany"
CommentId (String, GSI sort key) the very same field as before
Now your queries:
Get all comments belonging to a userId
this is pretty straightforward, just use partition key/sort key and specify only a partition key.
Get a comment with commentID = "commentID"
Use first GSI and provide a comment id.
Get comments of all users where topic = "commentTopic" and country = "commentCountry"
Use second GSI and provide a value for "comment id and country" as a single partition key value.
Now the limitation is that you can only have up to 10GB of data for a single partition key, so if this is likely that you can end up having more than 10GB of messages written by a user or more than 10GB of messages in a single topic, you may want to use multiple tables instead of a single table.
You can have Comments_2017, Comments_2016, etc. to store comments for different years. You can go more granular than that if you want to.
An additional benefit of this is that you can set different RCU/WCU values for different tables. I would imagine that newer comments are read more often, so you can set high RCU/WCI for the latest table, but old comments are not written at all and maybe read less often. In this case you can set less RCUs there.
Upvotes: 2