Reputation: 4823
For representing many-to-many relationship in Dynamo DB, I've commonly seen two approaches: Global Secondary Index (GSI) and Adjacency list. Now my question is, when to use which one?
The use of GSI basically states to flip the partition key and sort key around, so that the data can be efficiently queried in both cases. Examples show something like Online Game with Players, such as
Players table
--------------
Partition | Sort
-----------------
Player 1 | Game 1
Player 1 | Game 2
Player 2 | Game 1
Player 3 | Game 2
Games GSI
-----------
Partition | Sort
-----------------
Game 1 | Player 2
Game 1 | Player 2
Game 2 | Player 1
Game 2 | Player 3
I'm making the assumptions that these are all sessions within the same game platform, i.e. matches with finite amount of players.
This all seems straight forward and logical to implement... Until the data gets a bit more complex. What if both Players and Games have their own set of attributes? Let's say a Game has attribute for when it was started and Player has attributes such as username and personal game score. How do these get projected on each, Table and GSI?
For example, the projections required would be something like these
Get players participating a game
// query made with game id
{
start_date: '2018-11-04T13:00Z',
status: 'IN_PROGRESS',
players: [
{
username: 'starkshark',
points: 200
},
{
username: 'coldshot',
points 300
}
]
}
Get games that a player participated
// query made with player id
{
username: 'starkshark',
games: [
{
status: 'IN_PROGRESS',
start_date: '....'
},
{
status: 'ENDED',
start_date: '...',
end_date: '...'
}
]
}
Or is this the borderline case when one needs to use the adjacency list pattern? From what I've read in general about adjacency lists, it seems quite a complex to implement simple many-to-many relationship as with the online gaming example above. What's I've understood, it's meant to model graphs with multiple nodes linking to each other. Of course, in this case the nodes would be Games and Players (and probably any other entity needed in the model)
TLDR: So it boils down to the final questions, when having many-to-many relationship between entities that have their own set of attributes, is adjacency list the option to look for or is there a less complex data structure for the model?
Upvotes: 0
Views: 729
Reputation: 7669
When an entity has its own set of attributes, you could model it in your adjacency list as an edge that loops back on itself.
Id1 | Id2 | Data
- - - - - - - - - - - - - - - - - - -
User1 | User1 | {email=...}
Alternately, you can model an attribute as an edge from the entity to the attribute type.
Id1 | Id2 | Data
- - - - - - - - - - - - - - - - - - -
User1 | Email | bob@...
It all depends on how you want to be able to query your data. The advantage of the second approach is that you could have a GSI where Id2
is the partition key, and Data
is the range key, which would allow you to efficiently find all users which match a certain attribute value.
As for GSIs, the adjacency pattern typically has a GSI which contains the same primary key attributes as the table, but with the order transposed, so that you can easily query your adjacency list in either direction.
Upvotes: 3