Reputation: 1124
I am new to DynamoDB. I have looked at the AWS documentation and some of the questions regarding Hash and Range keys, but I am still not sure what exactly they mean and why or how you would use them. Can someone give me a simple explanation with an example?
For example, suppose I want to create a Movie table with Name, Genre, Rating and DateReleased columns. What would be the correct way to create the DynamoDB table? In the example below, I have some CloudFormation that tries to create it, but I am not sure if I have used the KeySchema property correctly. Any help would be appreciated.
MovieTable:
  Type: AWS::DynamoDB::Table
  Properties:
    BillingMode: PAY_PER_REQUEST
    AttributeDefinitions:
      - AttributeName: "Name"
        AttributeType: "S"
      - AttributeName: "Genre"
        AttributeType: "S"
      - AttributeName: "Rating"
        AttributeType: "N"
      - AttributeName: "DateReleased"
        AttributeType: "S"
    KeySchema:
      - AttributeName: "Name"
        KeyType: "HASH"
      - AttributeName: "Genre"
        KeyType: "RANGE"
      - AttributeName: "Rating"
        KeyType: "RANGE"
      - AttributeName: "DateReleased"
        KeyType: "RANGE"
    TimeToLiveSpecification:
      AttributeName: ExpireAfter
      Enabled: false
    SSESpecification:
      SSEEnabled: true
Upvotes: 4
Views: 4822
Reputation: 269276
There is an excellent online resource for DynamoDB: DynamoDB, explained.
From Anatomy of an Item | DynamoDB, explained:
When creating a new table, you will need to specify the primary key of that table. Every item in a table is uniquely identified by its primary key. Accordingly, the primary key must be included with every item that is written to a DynamoDB table.
There are two types of primary keys. A simple primary key uses a single attribute to identify an item, such as a Username or an OrderId. Using a DynamoDB table with a simple primary key is similar to using most simple key-value stores, such as Memcached.
A composite primary key uses a combination of two attributes to identify a particular item. The first attribute is a partition key (also known as a "hash key") which is used to segment and distribute items across shards. The second attribute is a sort key (also known as a "range key") which is used to order items with the same partition key. A DynamoDB table with a composite primary key can use interesting query patterns beyond simple get / set operations.
Understanding the primary key is a crucial part of planning your data model for a DynamoDB table. The primary key will be your main method of inserting and updating items in your table.
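Basically:
- If a single value uniquely identifies each item, a Partition Key on its own is enough. For a movie, that could be The Matrix (1999), or perhaps a unique ID that is assigned to the movie.
- If one value is not enough, use a composite key. For example, a TV show might have a Partition Key of The Simpsons, but there are many episodes, so the Sort Key can be used in conjunction with the Partition Key to give a unique name, such as The Simpsons: S02E12.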
A NoSQL database is very good at storing and retrieving data via the Primary Key (or Composite Key). It is not good at 'scanning' the database for information, since this is very inefficient. Thus, if information is often searched on non-primary fields (eg genre), it might be advantageous to create additional Secondary Indexes that can find items via other attributes. See: Secondary Indexes | DynamoDB, explained.
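As a rough illustration of the secondary-index idea, here is a minimal CloudFormation sketch that adds a Global Secondary Index keyed on Genre to a movie table. The index name GenreIndex and the choice of Name as the only table key are assumptions made for this example, not something from the question.

```yaml
Resources:
  MovieTable:
    Type: AWS::DynamoDB::Table
    Properties:
      BillingMode: PAY_PER_REQUEST
      # Every attribute used as a key (by the table or by an index)
      # must be declared here, and nothing else.
      AttributeDefinitions:
        - AttributeName: "Name"
          AttributeType: "S"
        - AttributeName: "Genre"
          AttributeType: "S"
      # Simple primary key: the movie name alone identifies an item.
      KeySchema:
        - AttributeName: "Name"
          KeyType: "HASH"
      GlobalSecondaryIndexes:
        - IndexName: "GenreIndex"   # hypothetical name
          # The index's partition key lets you query by genre
          # without scanning the whole table.
          KeySchema:
            - AttributeName: "Genre"
              KeyType: "HASH"
          Projection:
            ProjectionType: "ALL"
```

A Query against GenreIndex with Genre = "Comedy" would then return the matching movies directly. Unlike the table's primary key, index keys do not have to be unique, so many movies can share a genre.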
Upvotes: 3
Reputation: 2658
There are two cases here:
Case 1: A Hash Key on its own consists of a single attribute that uniquely identifies an item.
For example:
a. If you are designing a table to store student data, you can define the Hash Key as Student_Id.
b. If you are designing a table to store class data, you can define the Hash Key as Class_Id.
Case 2: A Hash and Range Key consists of two attributes that together uniquely identify an item.
For example:
a. If you are designing a table to store shopping carts, you can define the Hash Key as Product_Id and the Range Key (Sort Key) as User_Id. In other words, the combination of Product_Id and User_Id must be unique across your table.
b. If you are designing a table to store music information, you can define the Hash Key as Artist and the Range Key as 'Song Title'. In other words, the combination of Artist and 'Song Title' must be unique across your table.
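The two cases might look roughly like this in CloudFormation. This is only a sketch: the resource names, the attribute name SongTitle (standing in for 'Song Title'), and the string type chosen for Student_Id are assumptions for illustration.

```yaml
Resources:
  # Case 1: hash key only. Student_Id alone identifies an item.
  StudentTable:
    Type: AWS::DynamoDB::Table
    Properties:
      BillingMode: PAY_PER_REQUEST
      AttributeDefinitions:
        - AttributeName: "Student_Id"
          AttributeType: "S"      # assumed to be a string ID
      KeySchema:
        - AttributeName: "Student_Id"
          KeyType: "HASH"

  # Case 2: hash + range key. Artist and SongTitle together identify an item.
  MusicTable:
    Type: AWS::DynamoDB::Table
    Properties:
      BillingMode: PAY_PER_REQUEST
      AttributeDefinitions:
        - AttributeName: "Artist"
          AttributeType: "S"
        - AttributeName: "SongTitle"
          AttributeType: "S"
      KeySchema:
        - AttributeName: "Artist"
          KeyType: "HASH"         # exactly one HASH entry
        - AttributeName: "SongTitle"
          KeyType: "RANGE"        # at most one RANGE entry
```

Note that a KeySchema never has more than two entries: one HASH and, optionally, one RANGE.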
Now, for your use case, you can define a Movie table as:
HashKey: Movie Title
SortKey: Date Released
Genre: Attribute
Rating: Attribute
With this design, the assumption is that only one movie with a given 'Movie Title' can be released on a specific day. In other words, two movies with the same title are never released on the same day.
Hope you can define the CloudFormation template by now. If not, let us know and we will help.
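For reference, a minimal sketch of the Movie design above, reusing the attribute names from the question (Name for the movie title, DateReleased for the release date). The TTL and encryption settings from the question are left out to keep the focus on the keys.

```yaml
Resources:
  MovieTable:
    Type: AWS::DynamoDB::Table
    Properties:
      BillingMode: PAY_PER_REQUEST
      # Only key attributes are declared. Genre and Rating are ordinary
      # attributes that are simply included when an item is written;
      # listing them here without using them in a key fails validation.
      AttributeDefinitions:
        - AttributeName: "Name"
          AttributeType: "S"
        - AttributeName: "DateReleased"
          AttributeType: "S"
      KeySchema:
        - AttributeName: "Name"
          KeyType: "HASH"         # partition key
        - AttributeName: "DateReleased"
          KeyType: "RANGE"        # sort key (only one is allowed)
```

Compared with the template in the question, the extra RANGE entries are gone, because a table can have at most one sort key, and AttributeDefinitions lists only the two key attributes.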
Upvotes: 7