cdub
cdub

Reputation: 25711

What's cheaper on DynamoDB (GSI vs multiple tables)

I have an issue of making a username AND an email unique. It is quite easy with relationaldatabase and just do 2 queries and get the count back on each.

select count(email) from users;
select count(username) from users;

But in DynamoDB (NoSQL) is it better (i.e. cheaper) to have 2 tables like so:

 username table (where username is the hash) and check that table with a PUT and attribute_does_not_exist
AND
email table (where email is the hash) and check that table after the first one with a PUT and attribute_does_not_exist

OR do I

email table (hash) and username (GSI in that table). Then query the GSI first and if it doesn't exist then do a PUT with email and username

Which is better (cheaper)?

Upvotes: 0

Views: 2988

Answers (1)

JaredHatfield
JaredHatfield

Reputation: 6661

Two questions so I'll address them separately.

Which is cheaper?

You can run a single table with one GSI or two tables for the exact same cost if you want to because throughput for GSIs are provisioned the same way the primary table's throughput is.

Cost should not be a deciding factor.

Which is better?

The fact DynamoDB makes it difficult to have a secondary attribute retain its uniqueness is difficult is a common problem. Because of the asynchronous nature of GSIs the HASH or HASH/RANGE combination for a GSI is not unique. This can be taken advantage of in some circumstances.

If you use two tables you are taking the responsibility for keeping both tables in sync (something that is not easy to do in many situations). This comes with some important responsibilities (what happens if your app dies after writing to the first table but before it writes to the second), but this additional responsibility could allow you to maintain the uniqueness you want.


To explain how you would actually accomplish the dual uniqueness while maintaining accuracy, you would want to take advantage of conditional writes. The following outline describes a series of steps that would ensure that you maintain uniqueness.

  1. Write record to username table with condition that username is not in the table, but include a conditional flag set to false (if write fails, we bail)
  2. Write record to email table with condition that email is not in the table (if write fails, we delete the previous username record)
  3. Update the username record to set the conditional flag to true

The reason you would want to use a conditional flag with the username to essentially indicate that the record is not in a valid state is to ensure you actually maintain the uniqueness.

Upvotes: 2

Related Questions