wmatt

Reputation: 795

DynamoDB table design for notifications

Technologies used: AWS, Lambda, DynamoDB, Python.

I am not very experienced in DynamoDB/NoSQL and my case is the following:

  1. I need to store messages sent to users. Each user (identified by user_id) can have multiple messages (identified by message_id)
  2. I need to send notifications to each user about all of his/her messages stored in the table
  3. Notifications need to be sent at specified times based on user settings
  4. A user can have multiple notification times set, and the number is not limited: one user may want to be notified once a day, for example at noon, while another may want to be notified 4 times a day (e.g. 7:15, 11:00, 15:00 and 18:00). Full flexibility here is preferred

There will be a Lambda running every couple of minutes to get the messages that I need to notify the users about. The Lambda "knows what time it is" and wants to get only the messages of users who want to get their notifications at this point in time, based on their preferences.

The current DynamoDB table design is the following: user_messages table - Primary Key (Partition Key: user_id, Sort Key: message_id), attributes: message_text, creation_time etc.

My struggle is how to design the DB in an optimal way to limit the number of RCUs consumed and the compute time on the Lambda when extracting those messages. It would be simpler if I allowed each user to have only one notification time set: I'd just create a notification time attribute and a new GSI where the notification time would be the partition key. But this would limit the user too much.

I am not sure how to approach the case of multiple notification times per user. I have 2 possible scenarios now:

1. Limit the number of notification times to N, for example 3 max per user, store the preferences in 3 attributes and create 3 GSIs. In that case the Lambda would query the table 3 times on each run. This doesn't look elegant and I am concerned about the hard limit on the number of notifications.

The table design would look like this in that case: user_messages table - Primary Key (Partition Key: user_id, Sort Key: message_id), attributes: message_text, creation_time etc., GSI_1 (notification_time_1), GSI_2 (notification_time_2), GSI_3 (notification_time_3)

2. Create a separate table with user preferences, like Partition Key: notification_time, attribute: user_id

In that case the Lambda would have to get all user_ids for a particular notification time and then query user_messages_table once per user to get that user's messages. This means that if I have 1000 users to notify, I'd need to query user_messages_table 1000 times. That doesn't look good from a performance point of view and will consume a lot of RCUs.

Actually I am stuck here as none of the above solutions seems optimal for me.

Do you see any other approach I could take here?

Upvotes: 0

Views: 1378

Answers (1)

Maurice

Reputation: 13108

My understanding is that you're gathering messages for each user in a table and depending on the user you want to send these notifications at different points in time.

Update: There are two solutions. I'm having a hard time deciding between them, but I'd probably go for #2.

I'd probably go for a single table design like this:

PK   SK                        GSI1PK     GSI1SK   type                        attributes
U#1  NC#1                      NCT#08:30  U#1NC#1  NOTIFICATION_CONFIGURATION  {time_of_day_in_utc: 08:30}
U#1  NC#2                      NCT#17:30  U#1NC#2  NOTIFICATION_CONFIGURATION  {time_of_day_in_utc: 17:30}
U#1  MSG#2021-02-27...#ID#123  -          -        MESSAGE                     {message_id: 123, create_time: 2021-02-27T09:30:00Z, body: bla}
U#1  MSG#2021-02-27...#ID#789  -          -        MESSAGE                     {message_id: 789, create_time: 2021-02-27T10:30:00Z, body: blub}
U#2  NC#1                      NCT#10:15  U#2NC#1  NOTIFICATION_CONFIGURATION  {time_of_day_in_utc: 10:15}
U#2  MSG#2021-02-27...#ID#654  -          -        MESSAGE                     {message_id: 654, create_time: 2021-02-27T10:30:00Z, body: test}

PK is the partition key and SK the sort key of the table. GSI1PK and GSI1SK are the partition and sort keys of a global secondary index, GSI1. Message items have no GSI1 keys, so they don't show up in that index.
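As a rough sketch of how items for this layout could be built in Python: the function names are made up for illustration, the attributes are stored as top-level attributes rather than a nested map, and the full ISO timestamp is used where the table above abbreviates the message sort key. All of that is an assumption, not a fixed part of the design.

```python
def notification_config_item(user_id, config_no, time_utc):
    """Build one notification configuration item (time_utc like '08:30')."""
    return {
        "PK": f"U#{user_id}",
        "SK": f"NC#{config_no}",
        # The GSI1 keys make the item findable by time of day.
        "GSI1PK": f"NCT#{time_utc}",
        "GSI1SK": f"U#{user_id}NC#{config_no}",
        "type": "NOTIFICATION_CONFIGURATION",
        "time_of_day_in_utc": time_utc,
    }


def message_item(user_id, message_id, create_time, body):
    """Build one message item; it carries no GSI1 keys on purpose."""
    return {
        "PK": f"U#{user_id}",
        # Assumption: the full ISO 8601 timestamp goes into the sort key,
        # so messages of a user sort by creation time.
        "SK": f"MSG#{create_time}#ID#{message_id}",
        "type": "MESSAGE",
        "message_id": message_id,
        "create_time": create_time,
        "body": body,
    }
```

Dictionaries like these could be passed to put_item on a boto3 Table resource.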

Your Lambda function now has to perform the following steps:

  1. Get a list of users that need to be notified right now: Query GSI1 with GSI1PK=NCT#<time>
  2. For each user in the result from 1):
    1. Query the primary index with PK=U#<user-id> and SK begins_with MSG
    2. Send the messages
    3. Delete the messages from the table
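The steps above could look roughly like this in Python. This is a sketch, not a finished handler: it assumes `table` behaves like a boto3 Table resource (query and delete_item), and `query_all`, `notify_due_users` and the `send` callback are hypothetical names.

```python
def query_all(table, **kwargs):
    """Run a Query and follow LastEvaluatedKey until every page is read."""
    items = []
    while True:
        page = table.query(**kwargs)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return items
        kwargs["ExclusiveStartKey"] = last_key


def notify_due_users(table, time_utc, send):
    """Send and then delete the messages of all users due at time_utc."""
    # Step 1: notification configurations for this time of day, via GSI1.
    configs = query_all(
        table,
        IndexName="GSI1",
        KeyConditionExpression="GSI1PK = :pk",
        ExpressionAttributeValues={":pk": f"NCT#{time_utc}"},
    )
    for config in configs:
        # A KEYS_ONLY projection still contains the table's PK and SK.
        user_pk = config["PK"]
        # Step 2.1: all messages of that user from the primary index.
        messages = query_all(
            table,
            KeyConditionExpression="PK = :pk AND begins_with(SK, :prefix)",
            ExpressionAttributeValues={":pk": user_pk, ":prefix": "MSG#"},
        )
        if not messages:
            continue
        # Step 2.2: hand the messages to the (hypothetical) sender.
        send(user_pk, messages)
        # Step 2.3: remove what has been sent.
        for message in messages:
            table.delete_item(Key={"PK": message["PK"], "SK": message["SK"]})
```

In a real function you'd probably batch the deletes and think about what happens if sending fails halfway through.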

This way you can do a KEYS_ONLY projection for GSI1, which saves on storage and RCU costs.

You'll have to run one query per user with a matching notification configuration when you send the messages. The actual RCUs consumed should be fairly limited; it will just be a lot of requests.

You could also extend this design to store historical messages if you keep track of when the last notification was sent out to each user. Then you'd have an additional read for that attribute, but could change step 3 to a between query.
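A sketch of what the parameters for such a between query might look like, assuming the message sort key contains the full ISO 8601 UTC timestamp (MSG#<create_time>#ID#<id>) and that you have stored the time of the last notification somewhere. The function name is hypothetical.

```python
def messages_between_query(user_id, last_sent_iso, now_iso):
    """Build Query parameters for messages created in [last_sent, now)."""
    return {
        # BETWEEN is inclusive on both bounds. Because real sort keys
        # continue with '#ID#...', a message created exactly at now_iso
        # sorts after the upper bound and is left for the next run.
        "KeyConditionExpression": "PK = :pk AND SK BETWEEN :start AND :end",
        "ExpressionAttributeValues": {
            ":pk": f"U#{user_id}",
            ":start": f"MSG#{last_sent_iso}",
            ":end": f"MSG#{now_iso}",
        },
    }
```

The result could be passed as keyword arguments to query on a boto3 Table resource. This only works if all timestamps use the same format and time zone, since the comparison is done on strings.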


Alternative design

This may be better, although it might also result in a hot partition under write load, because all messages scheduled for the same time share one partition key.

PK        SK          type                        attributes
U#1       NC#1        NOTIFICATION_CONFIGURATION  {time_of_day_in_utc: 08:30}
U#1       NC#2        NOTIFICATION_CONFIGURATION  {time_of_day_in_utc: 17:30}
SM#17:30  U#1#ID#123  SCHEDULED_MESSAGE           {message_id: 123, create_time: 2021-02-27T09:30:00Z, body: bla}
SM#17:30  U#1#ID#789  SCHEDULED_MESSAGE           {message_id: 789, create_time: 2021-02-27T10:30:00Z, body: blub}
U#2       NC#1        NOTIFICATION_CONFIGURATION  {time_of_day_in_utc: 10:15}
SM#10:15  U#2#ID#654  SCHEDULED_MESSAGE           {message_id: 654, create_time: 2021-02-27T10:30:00Z, body: test}

When you add a new message you do the following:

  1. Do a query on PK=U#<id>, SK begins_with NC to get all notification configurations
  2. Select the notification configuration that comes next after the current time (i.e. the point in time when the next notification will be sent)
  3. Create a scheduled message as seen in the table, with the PK being SM#<time> for the time selected in 2)
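Steps 2 and 3 could be sketched like this. It assumes times are zero-padded "HH:MM" strings in UTC, so plain string comparison orders them correctly, and it picks the first time strictly after "now" to avoid scheduling into a slot that may already have been processed. Both function names are hypothetical.

```python
def next_notification_time(configured_times, now_hhmm):
    """Return the first configured 'HH:MM' time strictly after now_hhmm.

    If no time is left today, wrap around to the earliest one (tomorrow).
    """
    times = sorted(configured_times)
    if not times:
        raise ValueError("user has no notification times configured")
    for candidate in times:
        if candidate > now_hhmm:
            return candidate
    return times[0]


def scheduled_message_item(user_id, message_id, create_time, body, send_at):
    """Build a scheduled message item keyed by its send time."""
    return {
        "PK": f"SM#{send_at}",
        "SK": f"U#{user_id}#ID#{message_id}",
        "type": "SCHEDULED_MESSAGE",
        "message_id": message_id,
        "create_time": create_time,
        "body": body,
    }
```

For user 1 from the table above, a message created at 09:30 would end up under SM#17:30, and one created at 18:00 would wrap around to SM#08:30.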

The lambda that is supposed to send messages can now do this:

  1. Do a Query with PK=SM#<time> to get all messages that need to be sent out now
  2. For each message
    1. Send the message to the user
    2. Delete the message from the table
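A minimal sketch of that sending loop, again assuming `table` behaves like a boto3 Table resource and that `send` is whatever function actually delivers a message (both the function name and the callback are hypothetical):

```python
def send_scheduled_messages(table, time_utc, send):
    """Send and delete every message scheduled for time_utc; return the count."""
    kwargs = {
        "KeyConditionExpression": "PK = :pk",
        "ExpressionAttributeValues": {":pk": f"SM#{time_utc}"},
    }
    sent = 0
    while True:
        # A single Query returns at most 1 MB of data, so keep paging.
        page = table.query(**kwargs)
        for item in page.get("Items", []):
            send(item)
            table.delete_item(Key={"PK": item["PK"], "SK": item["SK"]})
            sent += 1
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return sent
        kwargs["ExclusiveStartKey"] = last_key
```

Compared to the first design, this is one query per time slot instead of one query per user.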

This way sending messages is cheaper, but changes to a user's notification times are applied with a delay, because messages that are already scheduled keep their old send time. Alternatively, whenever a user changes their notification times, you'd have to update that user's scheduled messages.

Upvotes: 2
