rodpadev
rodpadev

Reputation: 393

How do I approach a DynamoDB single table design?

I'm learning more about DynamoDB and I'm building a full serverless project using it as my main database. The project is a simple Poll app, where a user can create and vote in polls.

The data model for a SQL database is quite simple in fact this is all that it looks like poll app SQL schema

But with DynamoDB I need to think about access patterns and this is where it becomes a little bit confusing for me.

After watching a few courses on DynamoDB not specifically related to data modeling I was able to come up with this aggregate view, following the Single Table Design.

This was done in the NoSQL Workbench app

There are the same 3 entities within the same table but I have a question noSQL workbench

These are the access patterns I need for my app, at least the ones I can think for now

[Put, Delete, Get] User by userId
[Get] All Users by pollId (who voted in what)
[Put, Delete, Get] Poll by pollId
[Put, Delete, Get] All Polls by userId
[Put, Delete, Get] All Polls by visibility
[Create, Get] Poll Votes by pollId

Are Global Secondary Indexes all I need to satisfy these access patterns?

Are there any pitfalls in the way that I designed the data?

Are there any other tips you might recommend?

Upvotes: 1

Views: 814

Answers (1)

Ross Williams
Ross Williams

Reputation: 632

You are correct that using a GSI will be helpful to design a good access pattern here, but in your NoSQL Workbench screenshot you are not showing GSIs. Instead you are showing a batch write (or transact write) operation for each vote operation. 1. pk: user Id, sk: poll id, 2. pk: poll id, sk: user Id.

This is less efficient if you want to enforce a rule that no user can vote in the same poll twice. To enforce this with the current key structure you will need to use the transact write API call, which is more expensive.

Instead you can try to use a single write for a vote, keep attributes needed for condition checks in the pk or sk, and use a GSI to offer alternative access patterns.

Alternative Pattern: Model: Vote pk: pollId, sk: userId gsi1pk: userId, gsi1sk: pollId

Model: User pk: userId, sk: userId

Model: Poll pk: pollId sk: pollId

This has a tradeoff in that either 'Get all Poles (votes) by UserId' or 'Get all Users by PollId' will require 2 read operations. Choose the Vote pk and sk to favour the read operation which will see more traffic.

This design also suggests you may want to denormalise some data from the Poll model to the Vote model.

If you application is massively read heavy, your original model could be the correct choice. But consider that a query can read an extra data item for 0.5 RRU, and 1 on-demand RRU costs $0.25/million, whereas a transact write operation costs 2 WRU per item, and 1 on-demand WRU costs $1.25/million. This implies that unless your query 'Get all polls for a user' gets at least 20x the traffic of 'create vote', it is better to use a GSI (GSI costs an extra 1 WRU, but 3 transact write items costs 6 WRU).

Even using a read then batch-write to enforce uniqueness still causes an extra 1 WRU + 1 RRU in consumption for each vote..

Upvotes: 1

Related Questions