Reputation: 1
I'm developing the following rest services:
GET /parents
- Get all parents
GET /parents/{id}
- Get parent by id
POST /parents
- Add new parent
GET /parents/{pid}/children
- Get all children for parent pid
GET /parents/{pid}/children/{id}
- Get child by id for parent pid
POST /parents/{pid}/children
- Add new child for parent pid
GET /parents/{pid}/children/{cid}/subchildren
- Get all subchildren for parent pid and child cid
GET /parents/{pid}/children/{cid}/subchildren{id}
- Get subchildchild by id for parent pid and child cid
POST /parents/{pid}/children/{cid}/subchildren
- Add new subchild for parent pid and child cid
I have to use DynamoDB (or some other serverless and cheap database option on AWS) for performing CRUD operations (Update and Delete are not a priority as of now).
When GET /parents
is called, the response will be:
[
{
"id": 1,
"name": "parent1",
"children": [
{
"id": 1,
"name": "parent1_child1",
"subchildren": [
{
"id": 1,
"name": "parent1_child1_subchild1"
},
{
"id": 2,
"name": "parent1_child1_subchild2"
}
]
},
{
"id": 2,
"name": "parent1_child2",
"subchildren": [
{
"id": 1,
"name": "parent1_child2_subchild1"
},
{
"id": 2,
"name": "parent1_child2_subchild2"
},
{
"id": 3,
"name": "parent1_child2_subchild3"
}
]
}
]
},
{
"id": 2,
"name": "parent2",
"children": [
{
"id": 1,
"name": "parent2_child1",
"subchildren": [
{
"id": 1,
"name": "parent2_child1_subchild1"
}
]
},
{
"id": 2,
"name": "parent2_child2",
"subchildren": [
{
"id": 1,
"name": "parent2_child2_subchild1"
},
{
"id": 2,
"name": "parent2_child2_subchild2"
},
{
"id": 3,
"name": "parent2_child2_subchild3"
},
{
"id": 4,
"name": "parent2_child2_subchild4"
}
]
}
]
}
]
How should I model the data in DynamoDB? I am using spring-data-dynamodb for performing storing and retrieving options.
What is the best design in this case to create model classes?
Currently I have created different tables as mentioned in https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/SampleData.CreateTables.html#SampleData.CreateTables2
Upvotes: 0
Views: 3417
Reputation: 790
You can do several things:
A: better for cost/performance but requires a bit of extra work
In your table, create a partition key PK
and order key OK
as both strings (you can use whatever name you want for the keys of course, this is just an example - my point being, the keys are not in your data model, you have to create them upon insert). When you create your items, don't store them as a one big document but store parents, children and subchildren separately. Each item gets as PK
the value parent
/child
/subchild
, and as OK
the value that you have in each item's name
as it has the right parentname_childname_subchildname kind of inheritance, or you can construct it from each item's parents properties.
Upon retrieval, you can get all subchildren of a given parent for example with an expression such as where PK = subchild and OK begins_with parentname_
which effectively returns all items that are subchilds and that belong to a given parent, no matter what the child is. Il you want to also filter on the child, you can do begins with parentname_childname_
. If you want only the children of a parent, you can do where PK = child and OK begins_with parentname_
. If you know precisely what child you want, instead of using begins_with
you can simply do where PK = child and OK = parentname_childname
. And so forth... (I'm using SQL-like syntax for ease of writing/reading but it's easy enough to convert to dynamodb query language).
The idea is too differentiate your item's actual attributes and how you retrieve your item. You can create completely bogus keys if it helps you find your items easily!
You may not event need to differentiate the child "levels" in the PK, and you could as well use just a generic value there for similar results. The main objective of this method is to have an order key to use begins_with
on in a particular partition key.
The main problem with this method is that once you have all your parents/children/subchildren as in your /parents
example, you will have to reconstruct the tree entirely in code. But as far as dynamodb is concerned, this is probably the most optimized solution.
B: easier option using filters, but (much) worse cost/performance
Store each parent in a big document then scan your table with a filter like where item.children[].subchildren[].name = parentname_childname_subchildname
. It's not quite as efficient and costs much more, as filters are applied on the results of the query so you are retrieving everything first then filtering only what you want, so you pay for all the items first then cut out those you don't want. It works but it's not great, especially if you want something like 5 items out of 100000.
Main advantage is that for /parents
call, you get the whole hierarchy as described. But for subroutes /parents/x/children/...
you will need to either reformat the object in your code or query only sub attributes in the projection expression.
C: using local secondary indexes
A bit like solution A where you store items separately, with the same PK for all items but with a local secondary index with 3 different order keys: name_parent
, name_child
, name_subchild
. Then you can query each index separately depending on what you want. For this use case it's better to use LSIs than GSIs for cost optimization, but you are free to implement the same idea with GSIs. It will work just as well (with the same limitations about reformatting parent elements to contain children elements as well).
Upvotes: 3