Big Smile

Reputation: 1124

What exactly do Hash and Range keys mean in AWS DynamoDB?

I am new to DynamoDB. I have looked at the AWS documentation and at some of the existing questions about Hash and Range keys, but I am still not sure what exactly they mean and why or how you would use them. Can someone give me a simple explanation with an example?

For example, suppose I want to create a Movie table with Name, Genre, Rating and DateReleased columns. What would be the correct way to create the DynamoDB table? In the example below, I have some CloudFormation that tries to create it, but I am not sure whether I have used the KeySchema property correctly. Any help would be appreciated.

MovieTable:
    Type: AWS::DynamoDB::Table
    Properties:
      BillingMode: PAY_PER_REQUEST
      AttributeDefinitions:
        - AttributeName: "Name"
          AttributeType: "S"
        - AttributeName: "Genre"
          AttributeType: "S"
        - AttributeName: "Rating"
          AttributeType: "N"
        - AttributeName: "DateReleased"
          AttributeType: "S"
      KeySchema:
      - AttributeName: "Name"
        KeyType: "HASH"
      - AttributeName: "Genre"
        KeyType: "RANGE"
      - AttributeName: "Rating"
        KeyType: "RANGE"
      - AttributeName: "DateReleased"
        KeyType: "RANGE"
      TimeToLiveSpecification:
        AttributeName: ExpireAfter
        Enabled: false
      SSESpecification:
        SSEEnabled: true

Upvotes: 4

Views: 4822

Answers (2)

John Rotenstein

Reputation: 269276

There is an excellent online resource for DynamoDB: DynamoDB, explained.

From Anatomy of an Item | DynamoDB, explained:

When creating a new table, you will need to specify the primary key of that table. Every item in a table is uniquely identified by its primary key. Accordingly, the primary key must be included with every item that is written to a DynamoDB table.

There are two types of primary keys. A simple primary key uses a single attribute to identify an item, such as a Username or an OrderId. Using a DynamoDB table with a simple primary key is similar to using most simple key-value stores, such as Memcached.

A composite primary key uses a combination of two attributes to identify a particular item. The first attribute is a partition key (also known as a "hash key") which is used to segment and distribute items across shards. The second attribute is a sort key (also known as a "range key") which is used to order items with the same partition key. A DynamoDB table with a composite primary key can use interesting query patterns beyond simple get / set operations.

Understanding the primary key is a crucial part of planning your data model for a DynamoDB table. The primary key will be your main method of inserting and updating items in your table.

Basically:

  • The Partition Key (or Hash Key) decides where to put the item in storage. When it is used on its own (a simple primary key), it must be unique for each item, which lets an item be looked up directly and quickly without searching the table. For your Movies table, it would be a unique name for the movie, such as The Matrix (1999), or perhaps a unique ID that is assigned to the movie.
  • The Sort Key (or Range Key) can optionally be used to extend the Partition Key to provide the 'unique' element. In that case the Partition Key no longer has to be unique by itself; only the combination of the two does. For example, the show might be The Simpsons, but there are many episodes, so the Sort Key can be used in conjunction with the Partition Key to give a unique name, such as The Simpsons: S02E12.

A NoSQL database is very good at storing and retrieving data via the Primary Key (simple or composite). It is not good at 'scanning' the database for information, since a scan has to read every item in the table and is therefore very inefficient. Thus, if information is often searched on non-key attributes (e.g. genre), it might be advantageous to create additional Secondary Indexes that can find items via other attributes. See: Secondary Indexes | DynamoDB, explained.
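
As an illustration of that last point, a Global Secondary Index on Genre could be declared roughly as follows. This is only a sketch: the index name GenreIndex is made up for the example, and the table is assumed to use Name alone as its Hash Key.

```yaml
MovieTable:
  Type: AWS::DynamoDB::Table
  Properties:
    BillingMode: PAY_PER_REQUEST
    AttributeDefinitions:
      # Only attributes used as table or index keys are declared here.
      - AttributeName: "Name"
        AttributeType: "S"
      - AttributeName: "Genre"
        AttributeType: "S"
    KeySchema:
      - AttributeName: "Name"
        KeyType: "HASH"
    GlobalSecondaryIndexes:
      - IndexName: "GenreIndex"    # hypothetical name, not from the question
        KeySchema:
          - AttributeName: "Genre"
            KeyType: "HASH"        # index keys do not have to be unique
        Projection:
          ProjectionType: "ALL"    # copy all item attributes into the index
```

A Query against GenreIndex with a given Genre value would then return the matching movies without scanning the whole table.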

Upvotes: 3

Nghia Do

Reputation: 2658

There are two cases here:

Case 1: A Hash Key consists of a single attribute that uniquely identifies an item.

For example:

a. If you are designing a table to store student data, you can define the Hash Key as Student_Id.

b. If you are designing a table to store class data, you can define the Hash Key as Class_Id.

Case 2: A Hash and Range Key consists of two attributes that together uniquely identify an item.

For example:

a. If you are designing a table to store a shopping cart, you can define the Hash Key as Product_Id and the Range Key (Sort Key) as User_Id. In other words, the combination of Product_Id and User_Id must be unique across your table.

b. If you are designing a table to store music information, you can define the Hash Key as Artist and the Range Key as 'Song Title'. In other words, the combination of Artist and 'Song Title' must be unique across your table.

Now, for your use case, you can define a Movie table as

HashKey: Movie Title

SortKey: Date Released

Genre: Attribute

Rating: Attribute

With this design, the assumption is that only one movie with a given 'Movie Title' can be released on any specific day. In other words, two movies with the same title are never released on the same day.

Hopefully you can define the CloudFormation template now. If not, let us know and we will help.
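
For reference, a rough sketch of how this design might look in CloudFormation, assuming the attribute names from the question (Name as the Hash Key, DateReleased as the Range Key). KeySchema accepts at most two entries, one HASH and optionally one RANGE, and AttributeDefinitions should list only the attributes used as keys. Genre and Rating are not declared at all; they are simply written as part of each item.

```yaml
MovieTable:
  Type: AWS::DynamoDB::Table
  Properties:
    BillingMode: PAY_PER_REQUEST
    AttributeDefinitions:
      # Declare key attributes only; Genre and Rating are non-key attributes.
      - AttributeName: "Name"
        AttributeType: "S"
      - AttributeName: "DateReleased"
        AttributeType: "S"
    KeySchema:
      - AttributeName: "Name"
        KeyType: "HASH"     # partition key
      - AttributeName: "DateReleased"
        KeyType: "RANGE"    # sort key
    # The two settings below are carried over unchanged from the question.
    TimeToLiveSpecification:
      AttributeName: ExpireAfter
      Enabled: false
    SSESpecification:
      SSEEnabled: true
```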

Upvotes: 7
