Reputation: 2145
I am trying to query Cosmos DB via the SQL API. I have a straightforward data model with different types of documents in the same container (since they're related to each other), and I'm using a custom type key on each document that defines the document type, following this tutorial.
Here's what my documents look like:
document a:
{
"id": 1,
"b_id": 2,
"c_id": 3,
"foo": "bar",
"type": "a"
}
document b:
{
"id": 2,
"baz": "qux",
"type": "b"
}
document c:
{
"id": 3,
"quux": "quuz",
"type": "c"
}
I know that the JOIN syntax is different in Cosmos DB and only works within a single document. I am trying to find a way to retrieve document a together with the attributes of documents b and c:
{
"id": 1,
"foo": "bar",
"baz": "qux",
"quux": "quuz"
}
Is this possible in an efficient way? If not, I'm considering denormalizing the data before writing to Cosmos DB. Something like:
{
"id": 1,
"b" : { "baz": "qux" },
"c" : { "quux": "quuz" }
}
But this way I have to write all of b and c into each document, whereas storing only a reference to them seems more efficient in terms of storage capacity.
Upvotes: 2
Views: 1568
Reputation: 8763
Your data model is not really a good fit for the type of data you have. You typically only reference data when there is a many-to-many relationship, and even then you store only one relationship per document. With your current data model, the only way to retrieve the related data is to query the container twice.
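For illustration, the two-query approach could look roughly like the sketch below. It assumes the Python SDK (azure-cosmos), where `container` is a `ContainerProxy`, and it assumes the ids are stored as strings, since Cosmos DB requires `id` to be a string. The function names `merge_related` and `fetch_a_with_children` are hypothetical, and the partition key is unknown here, so the queries are run cross-partition.

```python
def merge_related(parent, related):
    """Flatten a parent document and the documents it references into one dict."""
    # Keep the parent's own fields; drop the reference keys, the type key,
    # and Cosmos DB system properties (they start with an underscore).
    skip = {"type", "b_id", "c_id"}
    merged = {k: v for k, v in parent.items()
              if k not in skip and not k.startswith("_")}
    for doc in related:
        # Copy each child's payload fields; its own id and type are not needed.
        merged.update({k: v for k, v in doc.items()
                       if k not in ("id", "type") and not k.startswith("_")})
    return merged


def fetch_a_with_children(container, a_id):
    """Two round trips: one for document a, one for the documents it references."""
    parents = list(container.query_items(
        query="SELECT * FROM c WHERE c.type = 'a' AND c.id = @id",
        parameters=[{"name": "@id", "value": a_id}],
        enable_cross_partition_query=True,
    ))
    if not parents:
        return None  # no document a with that id
    parent = parents[0]

    children = list(container.query_items(
        query="SELECT * FROM c WHERE c.id IN (@b_id, @c_id)",
        parameters=[
            {"name": "@b_id", "value": parent["b_id"]},
            {"name": "@c_id", "value": parent["c_id"]},
        ],
        enable_cross_partition_query=True,
    ))
    return merge_related(parent, children)
```

The merge itself happens on the client, which is why this costs two round trips and the request units of two queries.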
What you have here is a one-to-many (or one-to-few) relationship, where a single document holds references to multiple other documents.
In those scenarios you should embed the child items if the relationship is one-to-few. If it is a large or unbounded one-to-many, model the data so that you can retrieve all related items through a common property and value.
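As a rough sketch of the "common property" idea, suppose every related document carries the parent's id in a property called `a_id`, and that `/a_id` is also the container's partition key. Both the property name and the partition key choice are assumptions made for this example, as is the function name `fetch_related`. With the Python SDK (azure-cosmos), a single single-partition query then returns the whole group:

```python
def fetch_related(container, a_id):
    """Return every document that shares the given a_id, grouped by type."""
    items = container.query_items(
        query="SELECT * FROM c WHERE c.a_id = @a_id",
        parameters=[{"name": "@a_id", "value": a_id}],
        # Assumes /a_id is the partition key, so only one partition is queried.
        partition_key=a_id,
    )
    by_type = {}
    for doc in items:
        # Bucket the results by the custom "type" key used in the question.
        by_type.setdefault(doc["type"], []).append(doc)
    return by_type
```

Grouping by `type` on the client keeps the query simple while still letting the caller tell parent and child documents apart.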
In your scenario, if there is just a small number of child documents, I would embed them. Then, if you've modeled your data so that you know both the partition key and the id of the document, you can retrieve it with a point read, ReadItemAsync(), rather than a query. A point read is both very fast and very efficient.
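ReadItemAsync() is the .NET SDK call. As a minimal sketch of the same point read in the Python SDK (azure-cosmos), `read_item` takes the item id and the partition key value. The example assumes the embedded shape from the question (child data nested under `b` and `c`), and the function name `read_flattened` is hypothetical.

```python
def read_flattened(container, doc_id, partition_key):
    """Point-read one document and hoist its embedded b and c fields to the top level."""
    # A point read needs both the id and the partition key value; no query is run.
    doc = container.read_item(item=doc_id, partition_key=partition_key)
    flat = {}
    for key, value in doc.items():
        if key.startswith("_"):
            continue  # skip system properties such as _etag and _ts
        if key in ("b", "c"):
            flat.update(value)  # merge the embedded child's fields
        else:
            flat[key] = value
    return flat
```

If the flattened shape is what callers always need, the child fields could also be stored at the top level directly, which makes the hoisting step unnecessary.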
Upvotes: 1