NoSQL Structuring of Data

Question

Coming from a relational database background, I find that sometimes finding the right way to structure my NoSQL databases is a challenge (yes, I realize the statement sounds silly). I work with DynamoDB.

If I have three entities (a user, a report and a building), and many users can submit many reports on a building, would the following structure be acceptable?

User - index on userId
Building - index on buildingId
Report - index on reportId, userId and buildingId

Or do I need a fourth table to keep track of reports submitted by users? My points of concern are performance, throughput and storage space.

JaredHatfield · Accepted Answer

In DynamoDB, global secondary indexes (GSIs) provide alternative ways to query data from a table.

Based on the tables you have described, here is a structure that may work:

User Table

Hash Key: userId

Building Table

Hash Key: buildingId

Report Table

Hash Key: reportId
ReportUser GSI
- Hash Key: userId
BuildingUser GSI
- Hash Key: buildingId

The key to the above design is the pair of global secondary indexes on the Report table. Unlike the primary key of the main table (the hash key, or the hash key combined with the optional range key), the key of a GSI does not have to be unique. This means you can query all of the reports submitted by a specific userId or all of the reports for a specific buildingId, without needing a fourth table.
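As a rough sketch of how the Report table above might be declared, here are the keyword arguments for a CreateTable call (for example boto3's `client.create_table(**params)`). The string attribute types, the on-demand billing mode and the `ALL` projection are my assumptions, not something the question specifies; the function name is just for illustration.

```python
def report_table_definition():
    """Build CreateTable keyword arguments for the Report table and its two GSIs."""

    def gsi(index_name, hash_attribute):
        # A hash-only GSI; many reports may share the same hash key value.
        return {
            "IndexName": index_name,
            "KeySchema": [{"AttributeName": hash_attribute, "KeyType": "HASH"}],
            # Assumption: copy every attribute into the index.
            "Projection": {"ProjectionType": "ALL"},
        }

    return {
        "TableName": "Report",
        # Assumption: on-demand billing, so no ProvisionedThroughput is needed.
        "BillingMode": "PAY_PER_REQUEST",
        # Only attributes used in a key schema are declared here.
        # Assumption: all three ids are stored as strings ("S").
        "AttributeDefinitions": [
            {"AttributeName": "reportId", "AttributeType": "S"},
            {"AttributeName": "userId", "AttributeType": "S"},
            {"AttributeName": "buildingId", "AttributeType": "S"},
        ],
        # The table's own key must be unique per item.
        "KeySchema": [{"AttributeName": "reportId", "KeyType": "HASH"}],
        "GlobalSecondaryIndexes": [
            gsi("ReportUser", "userId"),
            gsi("BuildingUser", "buildingId"),
        ],
    }


params = report_table_definition()
```

The User and Building tables would follow the same pattern with only a `KeySchema` on `userId` or `buildingId` and no indexes.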

In practice, you would probably want these GSIs to include a range key (such as a date) so that the records come back in order when they are queried.
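To illustrate the ordering point, here is a minimal sketch of the Query parameters for "all reports by one user, newest first". It assumes the ReportUser GSI was given a range key holding a sortable date (I call it `createdAt` here, which is a hypothetical name) and that `userId` is a string; the helper name is also made up.

```python
def reports_by_user_query(user_id, newest_first=True):
    """Build Query keyword arguments for the ReportUser GSI.

    Assumes the GSI has a range key (hypothetically `createdAt`) so that
    DynamoDB can return the matching reports sorted by it.
    """
    return {
        "TableName": "Report",
        "IndexName": "ReportUser",
        # Only the hash key is constrained; every report by this user matches.
        "KeyConditionExpression": "userId = :uid",
        "ExpressionAttributeValues": {":uid": {"S": user_id}},
        # ScanIndexForward=True sorts ascending by range key, False descending.
        "ScanIndexForward": not newest_first,
    }


query = reports_by_user_query("user-123")
```

With boto3 this would be passed as `client.query(**query)`. Swapping in the other index and `buildingId` gives the per-building version.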

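The other thing to remember about GSIs is that you need to choose which attributes are projected, because only projected attributes can be retrieved from the index. A GSI is actually a physical copy of the data, so whatever you project is stored a second time and counts toward storage and write throughput. It also means the GSI is updated asynchronously, so reads from it are always eventually consistent.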
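The projection choice is expressed in the `Projection` part of each GSI definition. Here is a small sketch of the three options DynamoDB accepts (`KEYS_ONLY`, `INCLUDE`, `ALL`); the helper and the attribute names `title` and `status` are hypothetical and only there for illustration.

```python
def gsi_projection(projection_type, attributes=()):
    """Build the Projection block of a GSI definition."""
    if projection_type not in {"KEYS_ONLY", "INCLUDE", "ALL"}:
        raise ValueError(f"unknown projection type: {projection_type}")
    if projection_type == "INCLUDE":
        if not attributes:
            raise ValueError("INCLUDE needs at least one non-key attribute")
        # Only the keys plus the listed attributes are copied into the index.
        return {"ProjectionType": "INCLUDE", "NonKeyAttributes": list(attributes)}
    if attributes:
        raise ValueError("attributes are only valid with INCLUDE")
    # KEYS_ONLY copies just the table and index keys; ALL copies every attribute.
    return {"ProjectionType": projection_type}


# Smallest index: table and index keys only.
keys_only = gsi_projection("KEYS_ONLY")
# Hypothetical report attributes that a listing page might need.
listing = gsi_projection("INCLUDE", ["title", "status"])
```

A narrower projection keeps the index smaller and cheaper to write, at the price of a second read against the main table whenever you need an attribute that was not projected.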
